Dataset statistics
| Number of variables | 45 |
|---|---|
| Number of observations | 57588 |
| Missing cells | 44103 |
| Missing cells (%) | 1.7% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 118.7 MiB |
| Average record size in memory | 2.1 KiB |
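The two derived figures in this table can be checked from the raw counts. The snippet below is a minimal sketch that assumes "Missing cells (%)" is missing cells divided by variables times observations, and that the average record size is total memory divided by observations; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 45
n_observations = 57588
missing_cells = 44103
total_mib = 118.7

total_cells = n_variables * n_observations            # every cell in the frame
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
avg_record_kib = total_mib * 1024 / n_observations    # MiB -> KiB, per row

print(f"total cells: {total_cells}")                          # total cells: 2591460
print(f"missing cells: {missing_pct:.1f}%")                   # missing cells: 1.7%
print(f"average record size: {avg_record_kib:.1f} KiB")       # average record size: 2.1 KiB
```

Both results agree with the reported 1.7% and 2.1 KiB after rounding.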
Variable types
| CAT | 30 |
|---|---|
| NUM | 13 |
| BOOL | 2 |
Reproduction
| Analysis started | 2020-07-11 23:48:09.973832 |
|---|---|
| Analysis finished | 2020-07-11 23:49:20.284806 |
| Duration | 1 minute and 10.31 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
Warnings
recorded_by has constant value "GeoData Consultants Ltd" | Constant |
date_recorded has a high cardinality: 353 distinct values | High cardinality |
funder has a high cardinality: 1858 distinct values | High cardinality |
installer has a high cardinality: 2113 distinct values | High cardinality |
wpt_name has a high cardinality: 36720 distinct values | High cardinality |
subvillage has a high cardinality: 18567 distinct values | High cardinality |
lga has a high cardinality: 124 distinct values | High cardinality |
ward has a high cardinality: 2033 distinct values | High cardinality |
scheme_name has a high cardinality: 2658 distinct values | High cardinality |
geometry has a high cardinality: 57519 distinct values | High cardinality |
x is highly correlated with longitude | High correlation |
longitude is highly correlated with x | High correlation |
y is highly correlated with latitude | High correlation |
latitude is highly correlated with y | High correlation |
extraction_type_group is highly correlated with extraction_type and 1 other fields | High correlation |
extraction_type is highly correlated with extraction_type_group and 1 other fields | High correlation |
extraction_type_class is highly correlated with extraction_type and 1 other fields | High correlation |
management_group is highly correlated with management | High correlation |
management is highly correlated with management_group | High correlation |
payment_type is highly correlated with payment | High correlation |
payment is highly correlated with payment_type | High correlation |
quality_group is highly correlated with water_quality | High correlation |
water_quality is highly correlated with quality_group | High correlation |
quantity_group is highly correlated with quantity | High correlation |
quantity is highly correlated with quantity_group | High correlation |
source_type is highly correlated with source and 1 other fields | High correlation |
source is highly correlated with source_type and 1 other fields | High correlation |
source_class is highly correlated with source and 1 other fields | High correlation |
waterpoint_type_group is highly correlated with waterpoint_type | High correlation |
waterpoint_type is highly correlated with waterpoint_type_group | High correlation |
funder has 3622 (6.3%) missing values | Missing |
installer has 3636 (6.3%) missing values | Missing |
public_meeting has 2976 (5.2%) missing values | Missing |
scheme_management has 3750 (6.5%) missing values | Missing |
scheme_name has 26692 (46.3%) missing values | Missing |
permit has 3056 (5.3%) missing values | Missing |
amount_tsh is highly skewed (γ1 = 56.93966707) | Skewed |
num_private is highly skewed (γ1 = 90.52355548) | Skewed |
geometry is uniformly distributed | Uniform |
Unnamed: 0 has unique values | Unique |
id has unique values | Unique |
amount_tsh has 39827 (69.2%) zeros | Zeros |
gps_height has 18626 (32.3%) zeros | Zeros |
num_private has 56831 (98.7%) zeros | Zeros |
population has 19569 (34.0%) zeros | Zeros |
construction_year has 18897 (32.8%) zeros | Zeros |
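The list above reads like the output of simple per-column rules. The sketch below shows how such rules could look in plain Python. It is not the profiler's actual implementation: `column_alerts`, `CARDINALITY_THRESHOLD` and `MIN_FRACTION` are hypothetical names, and both threshold values are assumptions chosen to be consistent with the list (lga is flagged at 124 distinct values, while a column with a single zero among 57588 rows is not flagged).

```python
from collections import Counter

# Assumed thresholds, not read from the report's config.yaml.
CARDINALITY_THRESHOLD = 50   # distinct text values above this count as high cardinality
MIN_FRACTION = 0.01          # missing/zero share must exceed this to be reported

def column_alerts(name, values):
    """Return warning lines for one non-empty column; None marks a missing cell."""
    n = len(values)
    present = [v for v in values if v is not None]
    distinct = Counter(present)
    alerts = []

    if len(distinct) == 1:
        only = next(iter(distinct))
        alerts.append(f'{name} has constant value "{only}" | Constant')
    elif len(distinct) == n:
        alerts.append(f"{name} has unique values | Unique")

    is_text = bool(present) and all(isinstance(v, str) for v in present)
    if is_text and len(distinct) > CARDINALITY_THRESHOLD:
        alerts.append(
            f"{name} has a high cardinality: {len(distinct)} distinct values | High cardinality"
        )

    n_missing = n - len(present)
    if n_missing / n > MIN_FRACTION:
        alerts.append(f"{name} has {n_missing} ({n_missing / n:.1%}) missing values | Missing")

    n_zeros = sum(1 for v in present if not isinstance(v, str) and v == 0)
    if n_zeros / n > MIN_FRACTION:
        alerts.append(f"{name} has {n_zeros} ({n_zeros / n:.1%}) zeros | Zeros")
    return alerts

# Toy columns, not the real data.
print(column_alerts("recorded_by", ["GeoData Consultants Ltd"] * 4))
print(column_alerts("amount_tsh", [0, 0, 500, None]))
print(column_alerts("id", list(range(200))))
```

The last call illustrates why a unique identifier that contains one zero produces only a Unique line: its zero share stays below the assumed minimum fraction.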
Unnamed: 0
| Distinct count | 57588 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 29690.43078419115 |
|---|---|
| Minimum | 0 |
| Maximum | 59399 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2980.35 |
| Q1 | 14825.75 |
| median | 29688.5 |
| Q3 | 44542.25 |
| 95-th percentile | 56421.65 |
| Maximum | 59399 |
| Range | 59399 |
| Interquartile range (IQR) | 29716.5 |
Descriptive statistics
| Standard deviation | 17147.06679 |
|---|---|
| Coefficient of variation (CV) | 0.5775283933 |
| Kurtosis | -1.200257886 |
| Mean | 29690.43078 |
| Median Absolute Deviation (MAD) | 14858.5 |
| Skewness | 0.0006560063793 |
| Sum | 1709812528 |
| Variance | 294021899.4 |
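Several entries in the tables directly above are derived from others. The following sketch recomputes them from the reported values, assuming the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). Small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the three tables directly above.
minimum, maximum = 0, 59399
q1, q3 = 14825.75, 44542.25
mean = 29690.43078
std = 17147.06679

value_range = maximum - minimum   # reported Range: 59399
iqr = q3 - q1                     # reported IQR: 29716.5
cv = std / mean                   # reported CV: 0.5775283933
variance = std ** 2               # reported Variance: 294021899.4

print(value_range, iqr, round(cv, 6), round(variance, 1))
```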
| Value | Count | Frequency (%) | |
| 2047 | 1 | < 0.1% | |
| 51660 | 1 | < 0.1% | |
| 57993 | 1 | < 0.1% | |
| 37511 | 1 | < 0.1% | |
| 39558 | 1 | < 0.1% | |
| 33413 | 1 | < 0.1% | |
| 35460 | 1 | < 0.1% | |
| 45699 | 1 | < 0.1% | |
| 41601 | 1 | < 0.1% | |
| 43648 | 1 | < 0.1% | |
| 21087 | 1 | < 0.1% | |
| 23134 | 1 | < 0.1% | |
| 16989 | 1 | < 0.1% | |
| 19036 | 1 | < 0.1% | |
| 29275 | 1 | < 0.1% | |
| 31322 | 1 | < 0.1% | |
| 25177 | 1 | < 0.1% | |
| 4695 | 1 | < 0.1% | |
| 6742 | 1 | < 0.1% | |
| 597 | 1 | < 0.1% | |
| 2644 | 1 | < 0.1% | |
| 12883 | 1 | < 0.1% | |
| 14930 | 1 | < 0.1% | |
| 8785 | 1 | < 0.1% | |
| 10832 | 1 | < 0.1% | |
| Other values (57563) | 57563 | > 99.9% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 5 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% | |
| 7 | 1 | < 0.1% | |
| 8 | 1 | < 0.1% | |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 59399 | 1 | < 0.1% | |
| 59398 | 1 | < 0.1% | |
| 59397 | 1 | < 0.1% | |
| 59396 | 1 | < 0.1% | |
| 59395 | 1 | < 0.1% | |
| 59394 | 1 | < 0.1% | |
| 59393 | 1 | < 0.1% | |
| 59392 | 1 | < 0.1% | |
| 59391 | 1 | < 0.1% | |
| 59390 | 1 | < 0.1% |
id
| Distinct count | 57588 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37106.48807043134 |
|---|---|
| Minimum | 0 |
| Maximum | 74247 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3726.7 |
| Q1 | 18522.75 |
| median | 37054.5 |
| Q3 | 55667.25 |
| 95-th percentile | 70541.65 |
| Maximum | 74247 |
| Range | 74247 |
| Interquartile range (IQR) | 37144.5 |
Descriptive statistics
| Standard deviation | 21454.51421 |
|---|---|
| Coefficient of variation (CV) | 0.5781876789 |
| Kurtosis | -1.201821343 |
| Mean | 37106.48807 |
| Median Absolute Deviation (MAD) | 18569.5 |
| Skewness | 0.002243961664 |
| Sum | 2136888435 |
| Variance | 460296180 |
| Value | Count | Frequency (%) | |
| 2047 | 1 | < 0.1% | |
| 20959 | 1 | < 0.1% | |
| 4759 | 1 | < 0.1% | |
| 661 | 1 | < 0.1% | |
| 2708 | 1 | < 0.1% | |
| 12947 | 1 | < 0.1% | |
| 14994 | 1 | < 0.1% | |
| 8849 | 1 | < 0.1% | |
| 10896 | 1 | < 0.1% | |
| 53903 | 1 | < 0.1% | |
| 55950 | 1 | < 0.1% | |
| 49805 | 1 | < 0.1% | |
| 51852 | 1 | < 0.1% | |
| 62091 | 1 | < 0.1% | |
| 64138 | 1 | < 0.1% | |
| 57993 | 1 | < 0.1% | |
| 60040 | 1 | < 0.1% | |
| 33413 | 1 | < 0.1% | |
| 35460 | 1 | < 0.1% | |
| 45699 | 1 | < 0.1% | |
| 41601 | 1 | < 0.1% | |
| 43648 | 1 | < 0.1% | |
| 70263 | 1 | < 0.1% | |
| 72310 | 1 | < 0.1% | |
| 68212 | 1 | < 0.1% | |
| Other values (57563) | 57563 | > 99.9% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 5 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% | |
| 7 | 1 | < 0.1% | |
| 8 | 1 | < 0.1% | |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 74247 | 1 | < 0.1% | |
| 74246 | 1 | < 0.1% | |
| 74243 | 1 | < 0.1% | |
| 74242 | 1 | < 0.1% | |
| 74240 | 1 | < 0.1% | |
| 74239 | 1 | < 0.1% | |
| 74238 | 1 | < 0.1% | |
| 74237 | 1 | < 0.1% | |
| 74236 | 1 | < 0.1% | |
| 74235 | 1 | < 0.1% |
amount_tsh
| Distinct count | 98 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 327.64521862193516 |
|---|---|
| Minimum | 0.0 |
| Maximum | 350000.0 |
| Zeros | 39827 |
| Zeros (%) | 69.2% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 30 |
| 95-th percentile | 1200 |
| Maximum | 350000 |
| Range | 350000 |
| Interquartile range (IQR) | 30 |
Descriptive statistics
| Standard deviation | 3043.831403 |
|---|---|
| Coefficient of variation (CV) | 9.290022347 |
| Kurtosis | 4756.496721 |
| Mean | 327.6452186 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 56.93966707 |
| Sum | 18868432.85 |
| Variance | 9264909.609 |
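The combination reported above (69.2% zeros, median 0, maximum 350000, skewness about 56.9, which the warnings list attributes to amount_tsh) is typical of a zero-inflated column with a few very large values. The sketch below computes the bias-corrected (adjusted Fisher-Pearson) sample skewness on a small toy list. It assumes the report uses this estimator, which is the one pandas `Series.skew` computes; `sample_skewness` and the toy data are illustrative and are not taken from the dataset.

```python
import math
import statistics

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (bias-corrected), for n >= 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # biased second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # biased third central moment
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5

# Toy column: mostly zeros, some small payments, one extreme value.
toy = [0] * 69 + [50] * 20 + [500] * 10 + [350000]
symmetric = [1, 2, 3, 4, 5]

toy_skew = sample_skewness(toy)
print("median:", statistics.median(toy))
print("skewness (toy):", round(toy_skew, 2))
print("skewness (symmetric):", sample_skewness(symmetric))
```

The toy median is zero and the skewness is strongly positive, while the symmetric list gives zero. For a column like this, the mean and standard deviation are driven by a handful of rows, so the median and quantiles are the more robust summary.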
| Value | Count | Frequency (%) | |
| 0 | 39827 | 69.2% | |
| 500 | 3102 | 5.4% | |
| 50 | 2472 | 4.3% | |
| 1000 | 1488 | 2.6% | |
| 20 | 1463 | 2.5% | |
| 200 | 1220 | 2.1% | |
| 100 | 816 | 1.4% | |
| 10 | 806 | 1.4% | |
| 30 | 743 | 1.3% | |
| 2000 | 704 | 1.2% | |
| 250 | 569 | 1.0% | |
| 300 | 557 | 1.0% | |
| 5000 | 450 | 0.8% | |
| 5 | 376 | 0.7% | |
| 25 | 356 | 0.6% | |
| 3000 | 334 | 0.6% | |
| 1200 | 267 | 0.5% | |
| 1500 | 197 | 0.3% | |
| 6 | 190 | 0.3% | |
| 600 | 176 | 0.3% | |
| 4000 | 156 | 0.3% | |
| 2400 | 145 | 0.3% | |
| 2500 | 139 | 0.2% | |
| 6000 | 125 | 0.2% | |
| 7 | 69 | 0.1% | |
| Other values (73) | 841 | 1.5% |
| Value | Count | Frequency (%) | |
| 0 | 39827 | 69.2% | |
| 0.2 | 3 | < 0.1% | |
| 0.25 | 1 | < 0.1% | |
| 1 | 3 | < 0.1% | |
| 2 | 13 | < 0.1% | |
| 5 | 376 | 0.7% | |
| 6 | 190 | 0.3% | |
| 7 | 69 | 0.1% | |
| 9 | 1 | < 0.1% | |
| 10 | 806 | 1.4% |
| Value | Count | Frequency (%) | |
| 350000 | 1 | < 0.1% | |
| 250000 | 1 | < 0.1% | |
| 200000 | 1 | < 0.1% | |
| 170000 | 1 | < 0.1% | |
| 138000 | 1 | < 0.1% | |
| 120000 | 1 | < 0.1% | |
| 117000 | 7 | < 0.1% | |
| 100000 | 3 | < 0.1% | |
| 70000 | 1 | < 0.1% | |
| 60000 | 1 | < 0.1% |
date_recorded
| Distinct count | 353 |
|---|---|
| Unique (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| 2011-03-15 | 572 |
|---|---|
| 2011-03-17 | 558 |
| 2013-02-03 | 545 |
| 2011-03-14 | 520 |
| 2011-03-16 | 513 |
| Other values (348) |
| Value | Count | Frequency (%) | |
| 2011-03-15 | 572 | 1.0% | |
| 2011-03-17 | 558 | 1.0% | |
| 2013-02-03 | 545 | 0.9% | |
| 2011-03-14 | 520 | 0.9% | |
| 2011-03-16 | 513 | 0.9% | |
| 2011-03-18 | 497 | 0.9% | |
| 2011-03-19 | 466 | 0.8% | |
| 2011-03-04 | 458 | 0.8% | |
| 2011-03-05 | 434 | 0.8% | |
| 2013-01-24 | 433 | 0.8% | |
| 2013-03-15 | 428 | 0.7% | |
| 2013-02-14 | 427 | 0.7% | |
| 2011-03-11 | 426 | 0.7% | |
| 2013-01-29 | 418 | 0.7% | |
| 2011-03-23 | 417 | 0.7% | |
| 2011-03-09 | 416 | 0.7% | |
| 2013-02-04 | 411 | 0.7% | |
| 2013-02-15 | 399 | 0.7% | |
| 2011-03-30 | 391 | 0.7% | |
| 2013-02-26 | 391 | 0.7% | |
| 2011-03-24 | 381 | 0.7% | |
| 2013-02-16 | 381 | 0.7% | |
| 2013-03-19 | 381 | 0.7% | |
| 2013-02-13 | 380 | 0.7% | |
| 2013-01-30 | 380 | 0.7% | |
| Other values (328) | 46565 | 80.9% |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 0 | 134947 | 23.4% | |
| 1 | 125003 | 21.7% | |
| - | 115176 | 20.0% | |
| 2 | 100193 | 17.4% | |
| 3 | 51822 | 9.0% | |
| 7 | 12444 | 2.2% | |
| 4 | 10491 | 1.8% | |
| 8 | 8920 | 1.5% | |
| 6 | 5943 | 1.0% | |
| 5 | 5838 | 1.0% | |
| 9 | 5103 | 0.9% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 460704 | 80.0% | |
| Dash Punctuation | 115176 | 20.0% |
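The category names in this table are Unicode general categories. As a minimal sketch, the standard-library `unicodedata` module reproduces the same split for ISO-style date strings: `Nd` is Decimal Number and `Pd` is Dash Punctuation. The helper `category_counts` and the three sample dates are illustrative.

```python
import unicodedata
from collections import Counter

def category_counts(strings):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for s in strings:
        for ch in s:
            counts[unicodedata.category(ch)] += 1
    return counts

dates = ["2011-03-15", "2011-03-17", "2013-02-03"]
counts = category_counts(dates)
print(counts)  # Counter({'Nd': 24, 'Pd': 6})
```

Each 10-character date contributes 8 digits and 2 dashes, which matches the 80.0% / 20.0% split above: 57588 rows give 460704 digits and 115176 dashes.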
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 0 | 134947 | 29.3% | |
| 1 | 125003 | 27.1% | |
| 2 | 100193 | 21.7% | |
| 3 | 51822 | 11.2% | |
| 7 | 12444 | 2.7% | |
| 4 | 10491 | 2.3% | |
| 8 | 8920 | 1.9% | |
| 6 | 5943 | 1.3% | |
| 5 | 5838 | 1.3% | |
| 9 | 5103 | 1.1% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 115176 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 575880 | 100.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 0 | 134947 | 23.4% | |
| 1 | 125003 | 21.7% | |
| - | 115176 | 20.0% | |
| 2 | 100193 | 17.4% | |
| 3 | 51822 | 9.0% | |
| 7 | 12444 | 2.2% | |
| 4 | 10491 | 1.8% | |
| 8 | 8920 | 1.5% | |
| 6 | 5943 | 1.0% | |
| 5 | 5838 | 1.0% | |
| 9 | 5103 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 575880 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 0 | 134947 | 23.4% | |
| 1 | 125003 | 21.7% | |
| - | 115176 | 20.0% | |
| 2 | 100193 | 17.4% | |
| 3 | 51822 | 9.0% | |
| 7 | 12444 | 2.2% | |
| 4 | 10491 | 1.8% | |
| 8 | 8920 | 1.5% | |
| 6 | 5943 | 1.0% | |
| 5 | 5838 | 1.0% | |
| 9 | 5103 | 0.9% |
funder
| Distinct count | 1858 |
|---|---|
| Unique (%) | 3.4% |
| Missing | 3622 |
| Missing (%) | 6.3% |
| Memory size | 450.0 KiB |
| Government Of Tanzania | 8842 |
|---|---|
| Danida | 3114 |
| Hesawa | 1914 |
| World Bank | 1345 |
| Kkkt | 1287 |
| Other values (1853) |
| Value | Count | Frequency (%) | |
| Government Of Tanzania | 8842 | 15.4% | |
| Danida | 3114 | 5.4% | |
| Hesawa | 1914 | 3.3% | |
| World Bank | 1345 | 2.3% | |
| Kkkt | 1287 | 2.2% | |
| World Vision | 1224 | 2.1% | |
| Rwssp | 1187 | 2.1% | |
| Unicef | 1035 | 1.8% | |
| District Council | 843 | 1.5% | |
| Tasaf | 834 | 1.4% | |
| Dhv | 829 | 1.4% | |
| Private Individual | 824 | 1.4% | |
| 0 | 777 | 1.3% | |
| Norad | 765 | 1.3% | |
| Germany Republi | 610 | 1.1% | |
| Tcrs | 602 | 1.0% | |
| Ministry Of Water | 590 | 1.0% | |
| Water | 583 | 1.0% | |
| Dwe | 484 | 0.8% | |
| Netherlands | 461 | 0.8% | |
| Hifab | 450 | 0.8% | |
| Adb | 448 | 0.8% | |
| Lga | 442 | 0.8% | |
| Amref | 425 | 0.7% | |
| Fini Water | 393 | 0.7% | |
| Other values (1833) | 23658 | 41.1% | |
| (Missing) | 3622 | 6.3% |
Length
| Max length | 30 |
|---|---|
| Median length | 6 |
| Mean length | 9.563693825 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| a | 70095 | 12.7% | |
| n | 63776 | 11.6% | |
| i | 37442 | 6.8% | |
| e | 36531 | 6.6% | |
| (space) | 34053 | 6.2% | |
| r | 27524 | 5.0% | |
| t | 22550 | 4.1% | |
| o | 22317 | 4.1% | |
| s | 15843 | 2.9% | |
| d | 15267 | 2.8% | |
| f | 15017 | 2.7% | |
| m | 14835 | 2.7% | |
| v | 12694 | 2.3% | |
| T | 11806 | 2.1% | |
| l | 10992 | 2.0% | |
| G | 10462 | 1.9% | |
| O | 10362 | 1.9% | |
| z | 9437 | 1.7% | |
| c | 9119 | 1.7% | |
| u | 7860 | 1.4% | |
| D | 7427 | 1.3% | |
| W | 7193 | 1.3% | |
| w | 6936 | 1.3% | |
| k | 6489 | 1.2% | |
| p | 6196 | 1.1% | |
| Other values (44) | 58531 | 10.6% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 425935 | 77.3% | |
| Uppercase Letter | 87291 | 15.8% | |
| Space Separator | 34053 | 6.2% | |
| Other Punctuation | 1317 | 0.2% | |
| Decimal Number | 801 | 0.1% | |
| Open Punctuation | 436 | 0.1% | |
| Close Punctuation | 431 | 0.1% | |
| Dash Punctuation | 323 | 0.1% | |
| Connector Punctuation | 167 | < 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| T | 11806 | 13.5% | |
| G | 10462 | 12.0% | |
| O | 10362 | 11.9% | |
| D | 7427 | 8.5% | |
| W | 7193 | 8.2% | |
| C | 4662 | 5.3% | |
| R | 4236 | 4.9% | |
| H | 3139 | 3.6% | |
| M | 3131 | 3.6% | |
| K | 2948 | 3.4% | |
| A | 2891 | 3.3% | |
| S | 2631 | 3.0% | |
| I | 2420 | 2.8% | |
| B | 2050 | 2.3% | |
| N | 2010 | 2.3% | |
| P | 1922 | 2.2% | |
| U | 1855 | 2.1% | |
| V | 1772 | 2.0% | |
| L | 1404 | 1.6% | |
| F | 1379 | 1.6% | |
| J | 795 | 0.9% | |
| E | 435 | 0.5% | |
| Y | 233 | 0.3% | |
| Q | 111 | 0.1% | |
| Z | 16 | < 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| a | 70095 | 16.5% | |
| n | 63776 | 15.0% | |
| i | 37442 | 8.8% | |
| e | 36531 | 8.6% | |
| r | 27524 | 6.5% | |
| t | 22550 | 5.3% | |
| o | 22317 | 5.2% | |
| s | 15843 | 3.7% | |
| d | 15267 | 3.6% | |
| f | 15017 | 3.5% | |
| m | 14835 | 3.5% | |
| v | 12694 | 3.0% | |
| l | 10992 | 2.6% | |
| z | 9437 | 2.2% | |
| c | 9119 | 2.1% | |
| u | 7860 | 1.8% | |
| w | 6936 | 1.6% | |
| k | 6489 | 1.5% | |
| p | 6196 | 1.5% | |
| h | 5677 | 1.3% | |
| g | 3035 | 0.7% | |
| b | 2727 | 0.6% | |
| y | 2670 | 0.6% | |
| x | 565 | 0.1% | |
| j | 310 | 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 34053 | 100.0% |
Most frequent Open Punctuation characters
| Value | Count | Frequency (%) | |
| ( | 434 | 99.5% | |
| [ | 2 | 0.5% |
Most frequent Close Punctuation characters
| Value | Count | Frequency (%) | |
| ) | 429 | 99.5% | |
| ] | 2 | 0.5% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| / | 783 | 59.5% | |
| . | 469 | 35.6% | |
| \ | 33 | 2.5% | |
| & | 21 | 1.6% | |
| ' | 11 | 0.8% |
Most frequent Connector Punctuation characters
| Value | Count | Frequency (%) | |
| _ | 167 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 0 | 791 | 98.8% | |
| 2 | 5 | 0.6% | |
| 1 | 2 | 0.2% | |
| 9 | 2 | 0.2% | |
| 4 | 1 | 0.1% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 323 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 513226 | 93.2% | |
| Common | 37528 | 6.8% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| a | 70095 | 13.7% | |
| n | 63776 | 12.4% | |
| i | 37442 | 7.3% | |
| e | 36531 | 7.1% | |
| r | 27524 | 5.4% | |
| t | 22550 | 4.4% | |
| o | 22317 | 4.3% | |
| s | 15843 | 3.1% | |
| d | 15267 | 3.0% | |
| f | 15017 | 2.9% | |
| m | 14835 | 2.9% | |
| v | 12694 | 2.5% | |
| T | 11806 | 2.3% | |
| l | 10992 | 2.1% | |
| G | 10462 | 2.0% | |
| O | 10362 | 2.0% | |
| z | 9437 | 1.8% | |
| c | 9119 | 1.8% | |
| u | 7860 | 1.5% | |
| D | 7427 | 1.4% | |
| W | 7193 | 1.4% | |
| w | 6936 | 1.4% | |
| k | 6489 | 1.3% | |
| p | 6196 | 1.2% | |
| h | 5677 | 1.1% | |
| Other values (27) | 49379 | 9.6% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 34053 | 90.7% | |
| 0 | 791 | 2.1% | |
| / | 783 | 2.1% | |
| . | 469 | 1.2% | |
| ( | 434 | 1.2% | |
| ) | 429 | 1.1% | |
| - | 323 | 0.9% | |
| _ | 167 | 0.4% | |
| \ | 33 | 0.1% | |
| & | 21 | 0.1% | |
| ' | 11 | < 0.1% | |
| 2 | 5 | < 0.1% | |
| 1 | 2 | < 0.1% | |
| [ | 2 | < 0.1% | |
| ] | 2 | < 0.1% | |
| 9 | 2 | < 0.1% | |
| 4 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 550754 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| a | 70095 | 12.7% | |
| n | 63776 | 11.6% | |
| i | 37442 | 6.8% | |
| e | 36531 | 6.6% | |
| (space) | 34053 | 6.2% | |
| r | 27524 | 5.0% | |
| t | 22550 | 4.1% | |
| o | 22317 | 4.1% | |
| s | 15843 | 2.9% | |
| d | 15267 | 2.8% | |
| f | 15017 | 2.7% | |
| m | 14835 | 2.7% | |
| v | 12694 | 2.3% | |
| T | 11806 | 2.1% | |
| l | 10992 | 2.0% | |
| G | 10462 | 1.9% | |
| O | 10362 | 1.9% | |
| z | 9437 | 1.7% | |
| c | 9119 | 1.7% | |
| u | 7860 | 1.4% | |
| D | 7427 | 1.3% | |
| W | 7193 | 1.3% | |
| w | 6936 | 1.3% | |
| k | 6489 | 1.2% | |
| p | 6196 | 1.1% | |
| Other values (44) | 58531 | 10.6% |
gps_height
| Distinct count | 2428 |
|---|---|
| Unique (%) | 4.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 689.3251371813573 |
|---|---|
| Minimum | -90 |
| Maximum | 2770 |
| Zeros | 18626 |
| Zeros (%) | 32.3% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | -90 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 426 |
| Q3 | 1332 |
| 95-th percentile | 1803 |
| Maximum | 2770 |
| Range | 2860 |
| Interquartile range (IQR) | 1332 |
Descriptive statistics
| Standard deviation | 693.564188 |
|---|---|
| Coefficient of variation (CV) | 1.006149567 |
| Kurtosis | -1.326008097 |
| Mean | 689.3251372 |
| Median Absolute Deviation (MAD) | 426 |
| Skewness | 0.4131933762 |
| Sum | 39696856 |
| Variance | 481031.2829 |
| Value | Count | Frequency (%) | |
| 0 | 18626 | 32.3% | |
| -15 | 60 | 0.1% | |
| -16 | 55 | 0.1% | |
| -13 | 55 | 0.1% | |
| -20 | 52 | 0.1% | |
| 1290 | 52 | 0.1% | |
| -14 | 51 | 0.1% | |
| 303 | 51 | 0.1% | |
| -18 | 49 | 0.1% | |
| -19 | 47 | 0.1% | |
| 1269 | 46 | 0.1% | |
| 1295 | 46 | 0.1% | |
| 1304 | 45 | 0.1% | |
| -23 | 45 | 0.1% | |
| 280 | 44 | 0.1% | |
| 1538 | 44 | 0.1% | |
| 1286 | 44 | 0.1% | |
| -8 | 44 | 0.1% | |
| -17 | 44 | 0.1% | |
| 1332 | 43 | 0.1% | |
| 320 | 43 | 0.1% | |
| 1317 | 42 | 0.1% | |
| 1293 | 42 | 0.1% | |
| 1319 | 42 | 0.1% | |
| 1359 | 42 | 0.1% | |
| Other values (2403) | 37834 | 65.7% |
| Value | Count | Frequency (%) | |
| -90 | 1 | < 0.1% | |
| -63 | 2 | < 0.1% | |
| -59 | 1 | < 0.1% | |
| -57 | 1 | < 0.1% | |
| -55 | 1 | < 0.1% | |
| -54 | 1 | < 0.1% | |
| -53 | 1 | < 0.1% | |
| -52 | 2 | < 0.1% | |
| -51 | 2 | < 0.1% | |
| -50 | 5 | < 0.1% |
| Value | Count | Frequency (%) | |
| 2770 | 1 | < 0.1% | |
| 2628 | 1 | < 0.1% | |
| 2627 | 1 | < 0.1% | |
| 2626 | 2 | < 0.1% | |
| 2623 | 1 | < 0.1% | |
| 2614 | 1 | < 0.1% | |
| 2585 | 1 | < 0.1% | |
| 2576 | 1 | < 0.1% | |
| 2569 | 1 | < 0.1% | |
| 2568 | 1 | < 0.1% |
installer
| Distinct count | 2113 |
|---|---|
| Unique (%) | 3.9% |
| Missing | 3636 |
| Missing (%) | 6.3% |
| Memory size | 450.0 KiB |
| DWE | 16255 |
|---|---|
| Government | 1670 |
| RWE | 1181 |
| Commu | 1060 |
| DANIDA | 1050 |
| Other values (2108) |
| Value | Count | Frequency (%) | |
| DWE | 16255 | 28.2% | |
| Government | 1670 | 2.9% | |
| RWE | 1181 | 2.1% | |
| Commu | 1060 | 1.8% | |
| DANIDA | 1050 | 1.8% | |
| KKKT | 897 | 1.6% | |
| Hesawa | 803 | 1.4% | |
| 0 | 777 | 1.3% | |
| TCRS | 707 | 1.2% | |
| Central government | 619 | 1.1% | |
| CES | 610 | 1.1% | |
| DANID | 552 | 1.0% | |
| District Council | 551 | 1.0% | |
| Community | 539 | 0.9% | |
| HESAWA | 537 | 0.9% | |
| World vision | 408 | 0.7% | |
| LGA | 408 | 0.7% | |
| WEDECO | 397 | 0.7% | |
| District council | 392 | 0.7% | |
| Gover | 383 | 0.7% | |
| TASAF | 377 | 0.7% | |
| AMREF | 329 | 0.6% | |
| TWESA | 316 | 0.5% | |
| WU | 301 | 0.5% | |
| Dmdd | 287 | 0.5% | |
| Other values (2088) | 22546 | 39.2% | |
| (Missing) | 3636 | 6.3% |
Length
| Max length | 30 |
|---|---|
| Median length | 4 |
| Mean length | 5.962926304 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| D | 26366 | 7.7% | |
| W | 24498 | 7.1% | |
| E | 24200 | 7.0% | |
| n | 23264 | 6.8% | |
| a | 20721 | 6.0% | |
| e | 15083 | 4.4% | |
| i | 14923 | 4.3% | |
| A | 13487 | 3.9% | |
| r | 13139 | 3.8% | |
| t | 12622 | 3.7% | |
| (space) | 12572 | 3.7% | |
| o | 12111 | 3.5% | |
| C | 10452 | 3.0% | |
| m | 9090 | 2.6% | |
| S | 6624 | 1.9% | |
| R | 6475 | 1.9% | |
| l | 6119 | 1.8% | |
| s | 6105 | 1.8% | |
| I | 5982 | 1.7% | |
| T | 5823 | 1.7% | |
| u | 5416 | 1.6% | |
| K | 5375 | 1.6% | |
| c | 4815 | 1.4% | |
| N | 4632 | 1.3% | |
| G | 4290 | 1.2% | |
| Other values (44) | 49209 | 14.3% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 166234 | 48.4% | |
| Uppercase Letter | 162228 | 47.2% | |
| Space Separator | 12572 | 3.7% | |
| Other Punctuation | 964 | 0.3% | |
| Decimal Number | 781 | 0.2% | |
| Dash Punctuation | 268 | 0.1% | |
| Connector Punctuation | 169 | < 0.1% | |
| Open Punctuation | 159 | < 0.1% | |
| Close Punctuation | 16 | < 0.1% | |
| Currency Symbol | 2 | < 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| D | 26366 | 16.3% | |
| W | 24498 | 15.1% | |
| E | 24200 | 14.9% | |
| A | 13487 | 8.3% | |
| C | 10452 | 6.4% | |
| S | 6624 | 4.1% | |
| R | 6475 | 4.0% | |
| I | 5982 | 3.7% | |
| T | 5823 | 3.6% | |
| K | 5375 | 3.3% | |
| N | 4632 | 2.9% | |
| G | 4290 | 2.6% | |
| M | 4226 | 2.6% | |
| H | 3379 | 2.1% | |
| F | 3089 | 1.9% | |
| O | 3088 | 1.9% | |
| L | 2351 | 1.4% | |
| U | 2226 | 1.4% | |
| P | 1883 | 1.2% | |
| V | 1476 | 0.9% | |
| B | 794 | 0.5% | |
| J | 725 | 0.4% | |
| X | 356 | 0.2% | |
| Y | 245 | 0.2% | |
| Z | 128 | 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| n | 23264 | 14.0% | |
| a | 20721 | 12.5% | |
| e | 15083 | 9.1% | |
| i | 14923 | 9.0% | |
| r | 13139 | 7.9% | |
| t | 12622 | 7.6% | |
| o | 12111 | 7.3% | |
| m | 9090 | 5.5% | |
| l | 6119 | 3.7% | |
| s | 6105 | 3.7% | |
| u | 5416 | 3.3% | |
| c | 4815 | 2.9% | |
| v | 4274 | 2.6% | |
| d | 4178 | 2.5% | |
| w | 3293 | 2.0% | |
| g | 2670 | 1.6% | |
| y | 1769 | 1.1% | |
| h | 1696 | 1.0% | |
| p | 1419 | 0.9% | |
| k | 1392 | 0.8% | |
| f | 802 | 0.5% | |
| b | 503 | 0.3% | |
| j | 482 | 0.3% | |
| z | 320 | 0.2% | |
| x | 14 | < 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 12572 | 100.0% |
Most frequent Connector Punctuation characters
| Value | Count | Frequency (%) | |
| _ | 169 | 100.0% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| / | 669 | 69.4% | |
| . | 236 | 24.5% | |
| & | 48 | 5.0% | |
| ' | 11 | 1.1% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 0 | 778 | 99.6% | |
| 1 | 1 | 0.1% | |
| 4 | 1 | 0.1% | |
| 9 | 1 | 0.1% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 268 | 100.0% |
Most frequent Open Punctuation characters
| Value | Count | Frequency (%) | |
| ( | 157 | 98.7% | |
| [ | 2 | 1.3% |
Most frequent Close Punctuation characters
| Value | Count | Frequency (%) | |
| } | 13 | 81.2% | |
| ] | 2 | 12.5% | |
| ) | 1 | 6.2% |
Most frequent Currency Symbol characters
| Value | Count | Frequency (%) | |
| $ | 2 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 328462 | 95.7% | |
| Common | 14931 | 4.3% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| D | 26366 | 8.0% | |
| W | 24498 | 7.5% | |
| E | 24200 | 7.4% | |
| n | 23264 | 7.1% | |
| a | 20721 | 6.3% | |
| e | 15083 | 4.6% | |
| i | 14923 | 4.5% | |
| A | 13487 | 4.1% | |
| r | 13139 | 4.0% | |
| t | 12622 | 3.8% | |
| o | 12111 | 3.7% | |
| C | 10452 | 3.2% | |
| m | 9090 | 2.8% | |
| S | 6624 | 2.0% | |
| R | 6475 | 2.0% | |
| l | 6119 | 1.9% | |
| s | 6105 | 1.9% | |
| I | 5982 | 1.8% | |
| T | 5823 | 1.8% | |
| u | 5416 | 1.6% | |
| K | 5375 | 1.6% | |
| c | 4815 | 1.5% | |
| N | 4632 | 1.4% | |
| G | 4290 | 1.3% | |
| v | 4274 | 1.3% | |
| Other values (27) | 42576 | 13.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 12572 | 84.2% | |
| 0 | 778 | 5.2% | |
| / | 669 | 4.5% | |
| - | 268 | 1.8% | |
| . | 236 | 1.6% | |
| _ | 169 | 1.1% | |
| ( | 157 | 1.1% | |
| & | 48 | 0.3% | |
| } | 13 | 0.1% | |
| ' | 11 | 0.1% | |
| $ | 2 | < 0.1% | |
| [ | 2 | < 0.1% | |
| ] | 2 | < 0.1% | |
| ) | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 9 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 343393 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| D | 26366 | 7.7% | |
| W | 24498 | 7.1% | |
| E | 24200 | 7.0% | |
| n | 23264 | 6.8% | |
| a | 20721 | 6.0% | |
| e | 15083 | 4.4% | |
| i | 14923 | 4.3% | |
| A | 13487 | 3.9% | |
| r | 13139 | 3.8% | |
| t | 12622 | 3.7% | |
| (space) | 12572 | 3.7% | |
| o | 12111 | 3.5% | |
| C | 10452 | 3.0% | |
| m | 9090 | 2.6% | |
| S | 6624 | 1.9% | |
| R | 6475 | 1.9% | |
| l | 6119 | 1.8% | |
| s | 6105 | 1.8% | |
| I | 5982 | 1.7% | |
| T | 5823 | 1.7% | |
| u | 5416 | 1.6% | |
| K | 5375 | 1.6% | |
| c | 4815 | 1.4% | |
| N | 4632 | 1.3% | |
| G | 4290 | 1.2% | |
| Other values (44) | 49209 | 14.3% |
longitude
| Distinct count | 57515 |
|---|---|
| Unique (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.149669123888835 |
|---|---|
| Minimum | 29.6071219 |
| Maximum | 40.34519307 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | 29.6071219 |
|---|---|
| 5-th percentile | 30.62360773 |
| Q1 | 33.28510016 |
| median | 35.00594322 |
| Q3 | 37.23371212 |
| 95-th percentile | 39.15049865 |
| Maximum | 40.34519307 |
| Range | 10.73807117 |
| Interquartile range (IQR) | 3.94861196 |
Descriptive statistics
| Standard deviation | 2.60742797 |
|---|---|
| Coefficient of variation (CV) | 0.07418072587 |
| Kurtosis | -0.8692761515 |
| Mean | 35.14966912 |
| Median Absolute Deviation (MAD) | 1.979294605 |
| Skewness | -0.1348112926 |
| Sum | 2024199.146 |
| Variance | 6.798680617 |
| Value | Count | Frequency (%) | |
| 33.09034738 | 2 | < 0.1% | |
| 39.08628657 | 2 | < 0.1% | |
| 39.09309544 | 2 | < 0.1% | |
| 39.09851362 | 2 | < 0.1% | |
| 37.54340145 | 2 | < 0.1% | |
| 32.98856004 | 2 | < 0.1% | |
| 32.95652279 | 2 | < 0.1% | |
| 32.98767048 | 2 | < 0.1% | |
| 32.96700926 | 2 | < 0.1% | |
| 32.99327684 | 2 | < 0.1% | |
| 39.08596496 | 2 | < 0.1% | |
| 37.53432734 | 2 | < 0.1% | |
| 31.61952953 | 2 | < 0.1% | |
| 39.09568416 | 2 | < 0.1% | |
| 39.08618257 | 2 | < 0.1% | |
| 37.25219446 | 2 | < 0.1% | |
| 32.96573445 | 2 | < 0.1% | |
| 37.37571687 | 2 | < 0.1% | |
| 37.31891128 | 2 | < 0.1% | |
| 37.37401655 | 2 | < 0.1% | |
| 32.98269806 | 2 | < 0.1% | |
| 37.54090064 | 2 | < 0.1% | |
| 39.08887513 | 2 | < 0.1% | |
| 38.34050134 | 2 | < 0.1% | |
| 39.11921037 | 2 | < 0.1% | |
| Other values (57490) | 57538 | 99.9% |
| Value | Count | Frequency (%) | |
| 29.6071219 | 1 | < 0.1% | |
| 29.60720109 | 1 | < 0.1% | |
| 29.61032056 | 1 | < 0.1% | |
| 29.61096482 | 1 | < 0.1% | |
| 29.61194674 | 1 | < 0.1% | |
| 29.61250689 | 1 | < 0.1% | |
| 29.61276296 | 1 | < 0.1% | |
| 29.61344309 | 1 | < 0.1% | |
| 29.6168718 | 1 | < 0.1% | |
| 29.61847919 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 40.34519307 | 1 | < 0.1% | |
| 40.34430089 | 1 | < 0.1% | |
| 40.32523996 | 1 | < 0.1% | |
| 40.32522643 | 1 | < 0.1% | |
| 40.32340181 | 1 | < 0.1% | |
| 40.32283237 | 1 | < 0.1% | |
| 40.32280453 | 1 | < 0.1% | |
| 40.3226251 | 1 | < 0.1% | |
| 40.32216902 | 1 | < 0.1% | |
| 40.32196593 | 1 | < 0.1% |
latitude
| Distinct count | 57516 |
|---|---|
| Unique (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -5.885572340514864 |
|---|---|
| Minimum | -11.64944018 |
| Maximum | -0.99846435 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | -11.64944018 |
|---|---|
| 5-th percentile | -10.60147827 |
| Q1 | -8.643840785 |
| median | -5.17270373 |
| Q3 | -3.372824195 |
| 95-th percentile | -1.802689797 |
| Maximum | -0.99846435 |
| Range | 10.65097583 |
| Interquartile range (IQR) | 5.27101659 |
Descriptive statistics
| Standard deviation | 2.809876457 |
|---|---|
| Coefficient of variation (CV) | -0.477417708 |
| Kurtosis | -1.203165882 |
| Mean | -5.885572341 |
| Median Absolute Deviation (MAD) | 2.041399535 |
| Skewness | -0.2522877584 |
| Sum | -338938.3399 |
| Variance | 7.895405705 |
| Value | Count | Frequency (%) | |
| -6.97627011 | 2 | < 0.1% | |
| -6.98584173 | 2 | < 0.1% | |
| -7.05692253 | 2 | < 0.1% | |
| -6.9787555 | 2 | < 0.1% | |
| -6.95974873 | 2 | < 0.1% | |
| -6.96355665 | 2 | < 0.1% | |
| -2.46390984 | 2 | < 0.1% | |
| -7.10374232 | 2 | < 0.1% | |
| -6.98318263 | 2 | < 0.1% | |
| -2.51995041 | 2 | < 0.1% | |
| -2.52871573 | 2 | < 0.1% | |
| -6.9802204 | 2 | < 0.1% | |
| -6.98945622 | 2 | < 0.1% | |
| -2.50658954 | 2 | < 0.1% | |
| -6.95674564 | 2 | < 0.1% | |
| -7.10462503 | 2 | < 0.1% | |
| -2.51661939 | 2 | < 0.1% | |
| -2.49454559 | 2 | < 0.1% | |
| -6.96247516 | 2 | < 0.1% | |
| -6.98311512 | 2 | < 0.1% | |
| -2.49645868 | 2 | < 0.1% | |
| -9.2893492 | 2 | < 0.1% | |
| -2.51532072 | 2 | < 0.1% | |
| -6.99054864 | 2 | < 0.1% | |
| -6.9642576 | 2 | < 0.1% | |
| Other values (57491) | 57538 | 99.9% |
| Value | Count | Frequency (%) | |
| -11.64944018 | 1 | < 0.1% | |
| -11.64837759 | 1 | < 0.1% | |
| -11.58629656 | 1 | < 0.1% | |
| -11.56857679 | 1 | < 0.1% | |
| -11.56680457 | 1 | < 0.1% | |
| -11.56450865 | 1 | < 0.1% | |
| -11.56432357 | 1 | < 0.1% | |
| -11.56231592 | 1 | < 0.1% | |
| -11.56228898 | 1 | < 0.1% | |
| -11.56161898 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| -0.99846435 | 1 | < 0.1% | |
| -0.998916 | 1 | < 0.1% | |
| -0.99901209 | 1 | < 0.1% | |
| -0.99911702 | 1 | < 0.1% | |
| -0.9994692 | 1 | < 0.1% | |
| -0.99950651 | 1 | < 0.1% | |
| -0.99952232 | 1 | < 0.1% | |
| -1.00058519 | 1 | < 0.1% | |
| -1.0015208 | 1 | < 0.1% | |
| -1.00198784 | 1 | < 0.1% |
wpt_name
Categorical
| Distinct count | 36720 |
|---|---|
| Unique (%) | 63.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count | Frequency (%) | |
| none | 3492 | 6.1% | |
| Shuleni | 1734 | 3.0% | |
| Zahanati | 814 | 1.4% | |
| Msikitini | 533 | 0.9% | |
| Kanisani | 322 | 0.6% | |
| Sokoni | 256 | 0.4% | |
| Ofisini | 245 | 0.4% | |
| Shule Ya Msingi | 199 | 0.3% | |
| School | 197 | 0.3% | |
| Bombani | 155 | 0.3% | |
| Shule | 152 | 0.3% | |
| Sekondari | 145 | 0.3% | |
| Madukani | 101 | 0.2% | |
| Hospital | 86 | 0.1% | |
| Mkombozi | 84 | 0.1% | |
| Mbugani | 84 | 0.1% | |
| Kituo Cha Afya | 80 | 0.1% | |
| Kisimani | 78 | 0.1% | |
| Mkuyuni | 77 | 0.1% | |
| Ccm | 76 | 0.1% | |
| Ofisi Ya Kijiji | 76 | 0.1% | |
| Muungano | 76 | 0.1% | |
| Center | 73 | 0.1% | |
| Tankini | 73 | 0.1% | |
| Bwawani | 65 | 0.1% | |
| Other values (36695) | 48315 | 83.9% |
Length
| Max length | 30 |
|---|---|
| Median length | 10 |
| Mean length | 11.03274988 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| a | 96297 | 15.2% | |
| i | 51038 | 8.0% | |
| (space) | 49376 | 7.8% | |
| n | 40878 | 6.4% | |
| e | 40112 | 6.3% | |
| w | 31301 | 4.9% | |
| K | 31086 | 4.9% | |
| o | 29261 | 4.6% | |
| u | 23433 | 3.7% | |
| M | 21475 | 3.4% | |
| l | 20400 | 3.2% | |
| m | 16930 | 2.7% | |
| h | 16855 | 2.7% | |
| s | 16418 | 2.6% | |
| r | 13849 | 2.2% | |
| g | 12575 | 2.0% | |
| t | 11298 | 1.8% | |
| k | 10743 | 1.7% | |
| S | 10519 | 1.7% | |
| d | 10089 | 1.6% | |
| b | 10061 | 1.6% | |
| y | 7562 | 1.2% | |
| z | 6162 | 1.0% | |
| c | 4940 | 0.8% | |
| N | 4710 | 0.7% | |
| Other values (50) | 47986 | 7.6% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 480495 | 75.6% | |
| Uppercase Letter | 102885 | 16.2% | |
| Space Separator | 49376 | 7.8% | |
| Decimal Number | 1677 | 0.3% | |
| Other Punctuation | 701 | 0.1% | |
| Dash Punctuation | 103 | < 0.1% | |
| Open Punctuation | 37 | < 0.1% | |
| Close Punctuation | 37 | < 0.1% | |
| Connector Punctuation | 24 | < 0.1% | |
| Modifier Symbol | 19 | < 0.1% |
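The category names in this table correspond to Unicode general categories. As a rough illustration of how such a breakdown can be reproduced, here is a minimal sketch using Python's standard `unicodedata` module. The helper `category_counts`, the `CATEGORY_NAMES` mapping, and the three sample strings are assumptions made for the example; this is not necessarily how pandas-profiling computes the table internally.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in the table above.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pc": "Connector Punctuation",
    "Sk": "Modifier Symbol",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Illustrative sample only; the profiled column holds 57588 values.
sample = ["Shuleni", "Kituo Cha Afya", "Zinga/Ikerege"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

Applied to the full column, the same loop should yield one count per category, comparable to the rows above.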
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| a | 96297 | 20.0% | |
| i | 51038 | 10.6% | |
| n | 40878 | 8.5% | |
| e | 40112 | 8.3% | |
| w | 31301 | 6.5% | |
| o | 29261 | 6.1% | |
| u | 23433 | 4.9% | |
| l | 20400 | 4.2% | |
| m | 16930 | 3.5% | |
| h | 16855 | 3.5% | |
| s | 16418 | 3.4% | |
| r | 13849 | 2.9% | |
| g | 12575 | 2.6% | |
| t | 11298 | 2.4% | |
| k | 10743 | 2.2% | |
| d | 10089 | 2.1% | |
| b | 10061 | 2.1% | |
| y | 7562 | 1.6% | |
| z | 6162 | 1.3% | |
| c | 4940 | 1.0% | |
| p | 3470 | 0.7% | |
| j | 3338 | 0.7% | |
| f | 2255 | 0.5% | |
| v | 1030 | 0.2% | |
| x | 126 | < 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| K | 31086 | 30.2% | |
| M | 21475 | 20.9% | |
| S | 10519 | 10.2% | |
| N | 4710 | 4.6% | |
| A | 3430 | 3.3% | |
| B | 3222 | 3.1% | |
| C | 2717 | 2.6% | |
| P | 2507 | 2.4% | |
| L | 2475 | 2.4% | |
| J | 2320 | 2.3% | |
| Y | 1948 | 1.9% | |
| T | 1888 | 1.8% | |
| I | 1744 | 1.7% | |
| R | 1617 | 1.6% | |
| H | 1573 | 1.5% | |
| Z | 1498 | 1.5% | |
| D | 1400 | 1.4% | |
| G | 1299 | 1.3% | |
| O | 1215 | 1.2% | |
| E | 1191 | 1.2% | |
| U | 932 | 0.9% | |
| W | 862 | 0.8% | |
| F | 810 | 0.8% | |
| V | 387 | 0.4% | |
| Q | 53 | 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 49376 | 100.0% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| ' | 396 | 56.5% | |
| . | 174 | 24.8% | |
| / | 128 | 18.3% | |
| & | 2 | 0.3% | |
| \ | 1 | 0.1% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 103 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 1 | 507 | 30.2% | |
| 2 | 439 | 26.2% | |
| 3 | 151 | 9.0% | |
| 4 | 119 | 7.1% | |
| 7 | 106 | 6.3% | |
| 5 | 86 | 5.1% | |
| 6 | 79 | 4.7% | |
| 8 | 75 | 4.5% | |
| 9 | 70 | 4.2% | |
| 0 | 45 | 2.7% |
Most frequent Open Punctuation characters
| Value | Count | Frequency (%) | |
| ( | 29 | 78.4% | |
| [ | 8 | 21.6% |
Most frequent Close Punctuation characters
| Value | Count | Frequency (%) | |
| ) | 29 | 78.4% | |
| ] | 8 | 21.6% |
Most frequent Connector Punctuation characters
| Value | Count | Frequency (%) | |
| _ | 24 | 100.0% |
Most frequent Modifier Symbol characters
| Value | Count | Frequency (%) | |
| ` | 19 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 583380 | 91.8% | |
| Common | 51974 | 8.2% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| a | 96297 | 16.5% | |
| i | 51038 | 8.7% | |
| n | 40878 | 7.0% | |
| e | 40112 | 6.9% | |
| w | 31301 | 5.4% | |
| K | 31086 | 5.3% | |
| o | 29261 | 5.0% | |
| u | 23433 | 4.0% | |
| M | 21475 | 3.7% | |
| l | 20400 | 3.5% | |
| m | 16930 | 2.9% | |
| h | 16855 | 2.9% | |
| s | 16418 | 2.8% | |
| r | 13849 | 2.4% | |
| g | 12575 | 2.2% | |
| t | 11298 | 1.9% | |
| k | 10743 | 1.8% | |
| S | 10519 | 1.8% | |
| d | 10089 | 1.7% | |
| b | 10061 | 1.7% | |
| y | 7562 | 1.3% | |
| z | 6162 | 1.1% | |
| c | 4940 | 0.8% | |
| N | 4710 | 0.8% | |
| p | 3470 | 0.6% | |
| Other values (27) | 41918 | 7.2% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 49376 | 95.0% | |
| 1 | 507 | 1.0% | |
| 2 | 439 | 0.8% | |
| ' | 396 | 0.8% | |
| . | 174 | 0.3% | |
| 3 | 151 | 0.3% | |
| / | 128 | 0.2% | |
| 4 | 119 | 0.2% | |
| 7 | 106 | 0.2% | |
| - | 103 | 0.2% | |
| 5 | 86 | 0.2% | |
| 6 | 79 | 0.2% | |
| 8 | 75 | 0.1% | |
| 9 | 70 | 0.1% | |
| 0 | 45 | 0.1% | |
| ( | 29 | 0.1% | |
| ) | 29 | 0.1% | |
| _ | 24 | < 0.1% | |
| ` | 19 | < 0.1% | |
| [ | 8 | < 0.1% | |
| ] | 8 | < 0.1% | |
| & | 2 | < 0.1% | |
| \ | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 635354 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| a | 96297 | 15.2% | |
| i | 51038 | 8.0% | |
| (space) | 49376 | 7.8% | |
| n | 40878 | 6.4% | |
| e | 40112 | 6.3% | |
| w | 31301 | 4.9% | |
| K | 31086 | 4.9% | |
| o | 29261 | 4.6% | |
| u | 23433 | 3.7% | |
| M | 21475 | 3.4% | |
| l | 20400 | 3.2% | |
| m | 16930 | 2.7% | |
| h | 16855 | 2.7% | |
| s | 16418 | 2.6% | |
| r | 13849 | 2.2% | |
| g | 12575 | 2.0% | |
| t | 11298 | 1.8% | |
| k | 10743 | 1.7% | |
| S | 10519 | 1.7% | |
| d | 10089 | 1.6% | |
| b | 10061 | 1.6% | |
| y | 7562 | 1.2% | |
| z | 6162 | 1.0% | |
| c | 4940 | 0.8% | |
| N | 4710 | 0.7% | |
| Other values (50) | 47986 | 7.6% |
| Distinct count | 65 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.48906022087934986 |
|---|---|
| Minimum | 0 |
| Maximum | 1776 |
| Zeros | 56831 |
| Zeros (%) | 98.7% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 1776 |
| Range | 1776 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 12.42695441 |
|---|---|
| Coefficient of variation (CV) | 25.40986546 |
| Kurtosis | 10798.07441 |
| Mean | 0.4890602209 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 90.52355548 |
| Sum | 28164 |
| Variance | 154.429196 |
| Value | Count | Frequency (%) | |
| 0 | 56831 | 98.7% | |
| 6 | 81 | 0.1% | |
| 1 | 73 | 0.1% | |
| 5 | 46 | 0.1% | |
| 8 | 46 | 0.1% | |
| 32 | 40 | 0.1% | |
| 45 | 36 | 0.1% | |
| 15 | 35 | 0.1% | |
| 39 | 30 | 0.1% | |
| 93 | 28 | < 0.1% | |
| 3 | 27 | < 0.1% | |
| 7 | 26 | < 0.1% | |
| 2 | 23 | < 0.1% | |
| 65 | 22 | < 0.1% | |
| 47 | 21 | < 0.1% | |
| 102 | 20 | < 0.1% | |
| 4 | 20 | < 0.1% | |
| 17 | 17 | < 0.1% | |
| 80 | 15 | < 0.1% | |
| 20 | 14 | < 0.1% | |
| 25 | 12 | < 0.1% | |
| 11 | 11 | < 0.1% | |
| 41 | 10 | < 0.1% | |
| 34 | 10 | < 0.1% | |
| 16 | 8 | < 0.1% | |
| Other values (40) | 86 | 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 56831 | 98.7% | |
| 1 | 73 | 0.1% | |
| 2 | 23 | < 0.1% | |
| 3 | 27 | < 0.1% | |
| 4 | 20 | < 0.1% | |
| 5 | 46 | 0.1% | |
| 6 | 81 | 0.1% | |
| 7 | 26 | < 0.1% | |
| 8 | 46 | 0.1% | |
| 9 | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1776 | 1 | < 0.1% | |
| 1402 | 1 | < 0.1% | |
| 755 | 1 | < 0.1% | |
| 698 | 1 | < 0.1% | |
| 672 | 1 | < 0.1% | |
| 668 | 1 | < 0.1% | |
| 450 | 1 | < 0.1% | |
| 300 | 1 | < 0.1% | |
| 280 | 1 | < 0.1% | |
| 240 | 1 | < 0.1% |
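For a column this dominated by zeros, the mean says little about the non-zero records. The sketch below assumes only the figures printed above plus the 57588 observations from the dataset statistics, and relies on the column having no missing values. It recomputes the mean and the zero share, and derives the mean over the non-zero rows only.

```python
n_rows = 57588   # "Number of observations" in the dataset statistics
zeros = 56831    # "Zeros" in the overview above
total = 28164    # "Sum" in the descriptive statistics above

mean = total / n_rows                    # should match the reported Mean
zero_share = 100 * zeros / n_rows        # should match the reported Zeros (%)
nonzero_rows = n_rows - zeros
nonzero_mean = total / nonzero_rows      # mean over non-zero rows only

print(mean, round(zero_share, 1), nonzero_rows, nonzero_mean)
```

The very large skewness and kurtosis reported above are what one would expect when a few hundred non-zero values sit on top of tens of thousands of zeros.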
basin
Categorical
| Distinct count | 9 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count | Frequency (%) | |
| Pangani | 8940 | 15.5% | |
| Lake Victoria | 8535 | 14.8% | |
| Rufiji | 7976 | 13.9% | |
| Internal | 7785 | 13.5% | |
| Lake Tanganyika | 6333 | 11.0% | |
| Wami / Ruvu | 5987 | 10.4% | |
| Lake Nyasa | 5085 | 8.8% | |
| Ruvuma / Southern Coast | 4493 | 7.8% | |
| Lake Rukwa | 2454 | 4.3% |
Length
| Max length | 23 |
|---|---|
| Median length | 10 |
| Mean length | 10.82260193 |
| Min length | 6 |
Most occurring characters
| Value | Count | Frequency (%) | |
| a | 103203 | 16.6% | |
| i | 54282 | 8.7% | |
| n | 50609 | 8.1% | |
| (space) | 47860 | 7.7% | |
| u | 35883 | 5.8% | |
| e | 34685 | 5.6% | |
| k | 31194 | 5.0% | |
| t | 25306 | 4.1% | |
| L | 22407 | 3.6% | |
| R | 20910 | 3.4% | |
| r | 20813 | 3.3% | |
| o | 17521 | 2.8% | |
| g | 15273 | 2.5% | |
| y | 11418 | 1.8% | |
| v | 10480 | 1.7% | |
| m | 10480 | 1.7% | |
| / | 10480 | 1.7% | |
| s | 9578 | 1.5% | |
| P | 8940 | 1.4% | |
| V | 8535 | 1.4% | |
| c | 8535 | 1.4% | |
| f | 7976 | 1.3% | |
| j | 7976 | 1.3% | |
| I | 7785 | 1.2% | |
| l | 7785 | 1.2% | |
| Other values (7) | 33338 | 5.3% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 469944 | 75.4% | |
| Uppercase Letter | 94968 | 15.2% | |
| Space Separator | 47860 | 7.7% | |
| Other Punctuation | 10480 | 1.7% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| L | 22407 | 23.6% | |
| R | 20910 | 22.0% | |
| P | 8940 | 9.4% | |
| V | 8535 | 9.0% | |
| I | 7785 | 8.2% | |
| T | 6333 | 6.7% | |
| W | 5987 | 6.3% | |
| N | 5085 | 5.4% | |
| S | 4493 | 4.7% | |
| C | 4493 | 4.7% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| a | 103203 | 22.0% | |
| i | 54282 | 11.6% | |
| n | 50609 | 10.8% | |
| u | 35883 | 7.6% | |
| e | 34685 | 7.4% | |
| k | 31194 | 6.6% | |
| t | 25306 | 5.4% | |
| r | 20813 | 4.4% | |
| o | 17521 | 3.7% | |
| g | 15273 | 3.2% | |
| y | 11418 | 2.4% | |
| v | 10480 | 2.2% | |
| m | 10480 | 2.2% | |
| s | 9578 | 2.0% | |
| c | 8535 | 1.8% | |
| f | 7976 | 1.7% | |
| j | 7976 | 1.7% | |
| l | 7785 | 1.7% | |
| h | 4493 | 1.0% | |
| w | 2454 | 0.5% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 47860 | 100.0% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| / | 10480 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 564912 | 90.6% | |
| Common | 58340 | 9.4% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| a | 103203 | 18.3% | |
| i | 54282 | 9.6% | |
| n | 50609 | 9.0% | |
| u | 35883 | 6.4% | |
| e | 34685 | 6.1% | |
| k | 31194 | 5.5% | |
| t | 25306 | 4.5% | |
| L | 22407 | 4.0% | |
| R | 20910 | 3.7% | |
| r | 20813 | 3.7% | |
| o | 17521 | 3.1% | |
| g | 15273 | 2.7% | |
| y | 11418 | 2.0% | |
| v | 10480 | 1.9% | |
| m | 10480 | 1.9% | |
| s | 9578 | 1.7% | |
| P | 8940 | 1.6% | |
| V | 8535 | 1.5% | |
| c | 8535 | 1.5% | |
| f | 7976 | 1.4% | |
| j | 7976 | 1.4% | |
| I | 7785 | 1.4% | |
| l | 7785 | 1.4% | |
| T | 6333 | 1.1% | |
| W | 5987 | 1.1% | |
| Other values (5) | 21018 | 3.7% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 47860 | 82.0% | |
| / | 10480 | 18.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 623252 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| a | 103203 | 16.6% | |
| i | 54282 | 8.7% | |
| n | 50609 | 8.1% | |
| (space) | 47860 | 7.7% | |
| u | 35883 | 5.8% | |
| e | 34685 | 5.6% | |
| k | 31194 | 5.0% | |
| t | 25306 | 4.1% | |
| L | 22407 | 3.6% | |
| R | 20910 | 3.4% | |
| r | 20813 | 3.3% | |
| o | 17521 | 2.8% | |
| g | 15273 | 2.5% | |
| y | 11418 | 1.8% | |
| v | 10480 | 1.7% | |
| m | 10480 | 1.7% | |
| / | 10480 | 1.7% | |
| s | 9578 | 1.5% | |
| P | 8940 | 1.4% | |
| V | 8535 | 1.4% | |
| c | 8535 | 1.4% | |
| f | 7976 | 1.3% | |
| j | 7976 | 1.3% | |
| I | 7785 | 1.2% | |
| l | 7785 | 1.2% | |
| Other values (7) | 33338 | 5.3% |
subvillage
Categorical
| Distinct count | 18567 |
|---|---|
| Unique (%) | 32.5% |
| Missing | 371 |
| Missing (%) | 0.6% |
| Memory size | 450.0 KiB |
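The 32.5% figure is not simply the distinct count divided by the number of rows. A quick check, sketched below with values taken from this report (the variable names are illustrative), suggests that the percentage is computed against the non-missing rows. That reading is an inference from the numbers, not something the report states.

```python
n_rows = 57588    # "Number of observations" in the dataset statistics
missing = 371     # "Missing" in the overview above
distinct = 18567  # "Distinct count" in the overview above

pct_of_all_rows = 100 * distinct / n_rows
pct_of_non_missing = 100 * distinct / (n_rows - missing)

print(round(pct_of_all_rows, 1), round(pct_of_non_missing, 1))
```

Only the second ratio rounds to the reported value, so columns with many missing cells will show a noticeably higher Unique (%) than a plain distinct/rows ratio would give.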
| Value | Count | Frequency (%) | |
| Majengo | 494 | 0.9% | |
| Shuleni | 492 | 0.9% | |
| Madukani | 435 | 0.8% | |
| Kati | 366 | 0.6% | |
| Mtakuja | 257 | 0.4% | |
| Sokoni | 228 | 0.4% | |
| M | 187 | 0.3% | |
| Muungano | 170 | 0.3% | |
| Mbuyuni | 164 | 0.3% | |
| Mlimani | 147 | 0.3% | |
| Songambele | 135 | 0.2% | |
| Msikitini | 134 | 0.2% | |
| Miembeni | 134 | 0.2% | |
| 1 | 132 | 0.2% | |
| Kibaoni | 114 | 0.2% | |
| Kanisani | 110 | 0.2% | |
| I | 109 | 0.2% | |
| Mapinduzi | 109 | 0.2% | |
| Mjimwema | 108 | 0.2% | |
| Mjini | 104 | 0.2% | |
| Mkwajuni | 104 | 0.2% | |
| Mwenge | 101 | 0.2% | |
| Azimio | 98 | 0.2% | |
| Mabatini | 97 | 0.2% | |
| Bwawani | 91 | 0.2% | |
| Other values (18542) | 52597 | 91.3% | |
| (Missing) | 371 | 0.6% |
Length
| Max length | 30 |
|---|---|
| Median length | 7 |
| Mean length | 7.856029034 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| a | 69680 | 15.4% | |
| i | 44504 | 9.8% | |
| n | 33307 | 7.4% | |
| u | 25461 | 5.6% | |
| e | 25006 | 5.5% | |
| o | 23022 | 5.1% | |
| M | 19736 | 4.4% | |
| g | 18345 | 4.1% | |
| l | 15655 | 3.5% | |
| m | 14491 | 3.2% | |
| K | 12386 | 2.7% | |
| (space) | 11517 | 2.5% | |
| t | 11434 | 2.5% | |
| b | 11356 | 2.5% | |
| k | 10779 | 2.4% | |
| r | 9840 | 2.2% | |
| s | 9555 | 2.1% | |
| w | 9425 | 2.1% | |
| h | 9162 | 2.0% | |
| d | 7934 | 1.8% | |
| y | 6723 | 1.5% | |
| N | 5761 | 1.3% | |
| B | 4856 | 1.1% | |
| I | 4339 | 1.0% | |
| j | 4207 | 0.9% | |
| Other values (48) | 33932 | 7.5% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 370020 | 81.8% | |
| Uppercase Letter | 69126 | 15.3% | |
| Space Separator | 11517 | 2.5% | |
| Other Punctuation | 1078 | 0.2% | |
| Decimal Number | 578 | 0.1% | |
| Modifier Symbol | 45 | < 0.1% | |
| Dash Punctuation | 36 | < 0.1% | |
| Open Punctuation | 5 | < 0.1% | |
| Close Punctuation | 5 | < 0.1% | |
| Connector Punctuation | 3 | < 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| M | 19736 | 28.6% | |
| K | 12386 | 17.9% | |
| N | 5761 | 8.3% | |
| B | 4856 | 7.0% | |
| I | 4339 | 6.3% | |
| S | 3901 | 5.6% | |
| A | 2971 | 4.3% | |
| C | 2446 | 3.5% | |
| L | 2415 | 3.5% | |
| U | 1678 | 2.4% | |
| T | 1107 | 1.6% | |
| W | 1051 | 1.5% | |
| R | 901 | 1.3% | |
| O | 872 | 1.3% | |
| G | 855 | 1.2% | |
| J | 728 | 1.1% | |
| D | 622 | 0.9% | |
| P | 486 | 0.7% | |
| H | 457 | 0.7% | |
| E | 357 | 0.5% | |
| Z | 351 | 0.5% | |
| V | 333 | 0.5% | |
| Y | 276 | 0.4% | |
| F | 174 | 0.3% | |
| Q | 67 | 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| a | 69680 | 18.8% | |
| i | 44504 | 12.0% | |
| n | 33307 | 9.0% | |
| u | 25461 | 6.9% | |
| e | 25006 | 6.8% | |
| o | 23022 | 6.2% | |
| g | 18345 | 5.0% | |
| l | 15655 | 4.2% | |
| m | 14491 | 3.9% | |
| t | 11434 | 3.1% | |
| b | 11356 | 3.1% | |
| k | 10779 | 2.9% | |
| r | 9840 | 2.7% | |
| s | 9555 | 2.6% | |
| w | 9425 | 2.5% | |
| h | 9162 | 2.5% | |
| d | 7934 | 2.1% | |
| y | 6723 | 1.8% | |
| j | 4207 | 1.1% | |
| z | 3606 | 1.0% | |
| p | 2768 | 0.7% | |
| c | 1560 | 0.4% | |
| f | 1089 | 0.3% | |
| v | 1045 | 0.3% | |
| q | 62 | < 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 11517 | 100.0% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| ' | 945 | 87.7% | |
| / | 103 | 9.6% | |
| . | 28 | 2.6% | |
| # | 2 | 0.2% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 1 | 237 | 41.0% | |
| 2 | 70 | 12.1% | |
| 3 | 50 | 8.7% | |
| 4 | 44 | 7.6% | |
| 6 | 33 | 5.7% | |
| 8 | 32 | 5.5% | |
| 9 | 32 | 5.5% | |
| 0 | 30 | 5.2% | |
| 5 | 28 | 4.8% | |
| 7 | 22 | 3.8% |
Most frequent Modifier Symbol characters
| Value | Count | Frequency (%) | |
| ` | 45 | 100.0% |
Most frequent Open Punctuation characters
| Value | Count | Frequency (%) | |
| ( | 4 | 80.0% | |
| [ | 1 | 20.0% |
Most frequent Close Punctuation characters
| Value | Count | Frequency (%) | |
| ) | 4 | 80.0% | |
| ] | 1 | 20.0% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 36 | 100.0% |
Most frequent Connector Punctuation characters
| Value | Count | Frequency (%) | |
| _ | 3 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 439146 | 97.1% | |
| Common | 13267 | 2.9% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| a | 69680 | 15.9% | |
| i | 44504 | 10.1% | |
| n | 33307 | 7.6% | |
| u | 25461 | 5.8% | |
| e | 25006 | 5.7% | |
| o | 23022 | 5.2% | |
| M | 19736 | 4.5% | |
| g | 18345 | 4.2% | |
| l | 15655 | 3.6% | |
| m | 14491 | 3.3% | |
| K | 12386 | 2.8% | |
| t | 11434 | 2.6% | |
| b | 11356 | 2.6% | |
| k | 10779 | 2.5% | |
| r | 9840 | 2.2% | |
| s | 9555 | 2.2% | |
| w | 9425 | 2.1% | |
| h | 9162 | 2.1% | |
| d | 7934 | 1.8% | |
| y | 6723 | 1.5% | |
| N | 5761 | 1.3% | |
| B | 4856 | 1.1% | |
| I | 4339 | 1.0% | |
| j | 4207 | 1.0% | |
| S | 3901 | 0.9% | |
| Other values (26) | 28281 | 6.4% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 11517 | 86.8% | |
| ' | 945 | 7.1% | |
| 1 | 237 | 1.8% | |
| / | 103 | 0.8% | |
| 2 | 70 | 0.5% | |
| 3 | 50 | 0.4% | |
| ` | 45 | 0.3% | |
| 4 | 44 | 0.3% | |
| - | 36 | 0.3% | |
| 6 | 33 | 0.2% | |
| 8 | 32 | 0.2% | |
| 9 | 32 | 0.2% | |
| 0 | 30 | 0.2% | |
| 5 | 28 | 0.2% | |
| . | 28 | 0.2% | |
| 7 | 22 | 0.2% | |
| ( | 4 | < 0.1% | |
| ) | 4 | < 0.1% | |
| _ | 3 | < 0.1% | |
| # | 2 | < 0.1% | |
| [ | 1 | < 0.1% | |
| ] | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 452413 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| a | 69680 | 15.4% | |
| i | 44504 | 9.8% | |
| n | 33307 | 7.4% | |
| u | 25461 | 5.6% | |
| e | 25006 | 5.5% | |
| o | 23022 | 5.1% | |
| M | 19736 | 4.4% | |
| g | 18345 | 4.1% | |
| l | 15655 | 3.5% | |
| m | 14491 | 3.2% | |
| K | 12386 | 2.7% | |
| (space) | 11517 | 2.5% | |
| t | 11434 | 2.5% | |
| b | 11356 | 2.5% | |
| k | 10779 | 2.4% | |
| r | 9840 | 2.2% | |
| s | 9555 | 2.1% | |
| w | 9425 | 2.1% | |
| h | 9162 | 2.0% | |
| d | 7934 | 1.8% | |
| y | 6723 | 1.5% | |
| N | 5761 | 1.3% | |
| B | 4856 | 1.1% | |
| I | 4339 | 1.0% | |
| j | 4207 | 0.9% | |
| Other values (48) | 33932 | 7.5% |
region
Categorical
| Distinct count | 21 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count | Frequency (%) | |
| Iringa | 5294 | 9.2% | |
| Mbeya | 4639 | 8.1% | |
| Kilimanjaro | 4379 | 7.6% | |
| Morogoro | 4006 | 7.0% | |
| Shinyanga | 3977 | 6.9% | |
| Arusha | 3350 | 5.8% | |
| Kagera | 3316 | 5.8% | |
| Kigoma | 2816 | 4.9% | |
| Ruvuma | 2640 | 4.6% | |
| Pwani | 2635 | 4.6% | |
| Tanga | 2547 | 4.4% | |
| Mwanza | 2295 | 4.0% | |
| Dodoma | 2201 | 3.8% | |
| Singida | 2093 | 3.6% | |
| Mara | 1969 | 3.4% | |
| Tabora | 1959 | 3.4% | |
| Rukwa | 1808 | 3.1% | |
| Mtwara | 1730 | 3.0% | |
| Manyara | 1583 | 2.7% | |
| Lindi | 1546 | 2.7% | |
| Dar es Salaam | 805 | 1.4% |
Length
| Max length | 13 |
|---|---|
| Median length | 6 |
| Mean length | 6.591025908 |
| Min length | 4 |
Most occurring characters
| Value | Count | Frequency (%) | |
| a | 79789 | 21.0% | |
| r | 32397 | 8.5% | |
| i | 30758 | 8.1% | |
| n | 30326 | 8.0% | |
| o | 29580 | 7.8% | |
| g | 24049 | 6.3% | |
| M | 16222 | 4.3% | |
| m | 12841 | 3.4% | |
| K | 10511 | 2.8% | |
| u | 10438 | 2.7% | |
| y | 10199 | 2.7% | |
| e | 8760 | 2.3% | |
| w | 8468 | 2.2% | |
| h | 7327 | 1.9% | |
| S | 6875 | 1.8% | |
| b | 6598 | 1.7% | |
| d | 5840 | 1.5% | |
| I | 5294 | 1.4% | |
| l | 5184 | 1.4% | |
| T | 4506 | 1.2% | |
| R | 4448 | 1.2% | |
| j | 4379 | 1.2% | |
| s | 4155 | 1.1% | |
| A | 3350 | 0.9% | |
| D | 3006 | 0.8% | |
| Other values (7) | 14264 | 3.8% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 319561 | 84.2% | |
| Uppercase Letter | 58393 | 15.4% | |
| Space Separator | 1610 | 0.4% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| M | 16222 | 27.8% | |
| K | 10511 | 18.0% | |
| S | 6875 | 11.8% | |
| I | 5294 | 9.1% | |
| T | 4506 | 7.7% | |
| R | 4448 | 7.6% | |
| A | 3350 | 5.7% | |
| D | 3006 | 5.1% | |
| P | 2635 | 4.5% | |
| L | 1546 | 2.6% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| a | 79789 | 25.0% | |
| r | 32397 | 10.1% | |
| i | 30758 | 9.6% | |
| n | 30326 | 9.5% | |
| o | 29580 | 9.3% | |
| g | 24049 | 7.5% | |
| m | 12841 | 4.0% | |
| u | 10438 | 3.3% | |
| y | 10199 | 3.2% | |
| e | 8760 | 2.7% | |
| w | 8468 | 2.6% | |
| h | 7327 | 2.3% | |
| b | 6598 | 2.1% | |
| d | 5840 | 1.8% | |
| l | 5184 | 1.6% | |
| j | 4379 | 1.4% | |
| s | 4155 | 1.3% | |
| v | 2640 | 0.8% | |
| z | 2295 | 0.7% | |
| k | 1808 | 0.6% | |
| t | 1730 | 0.5% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 1610 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 377954 | 99.6% | |
| Common | 1610 | 0.4% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| a | 79789 | 21.1% | |
| r | 32397 | 8.6% | |
| i | 30758 | 8.1% | |
| n | 30326 | 8.0% | |
| o | 29580 | 7.8% | |
| g | 24049 | 6.4% | |
| M | 16222 | 4.3% | |
| m | 12841 | 3.4% | |
| K | 10511 | 2.8% | |
| u | 10438 | 2.8% | |
| y | 10199 | 2.7% | |
| e | 8760 | 2.3% | |
| w | 8468 | 2.2% | |
| h | 7327 | 1.9% | |
| S | 6875 | 1.8% | |
| b | 6598 | 1.7% | |
| d | 5840 | 1.5% | |
| I | 5294 | 1.4% | |
| l | 5184 | 1.4% | |
| T | 4506 | 1.2% | |
| R | 4448 | 1.2% | |
| j | 4379 | 1.2% | |
| s | 4155 | 1.1% | |
| A | 3350 | 0.9% | |
| D | 3006 | 0.8% | |
| Other values (6) | 12654 | 3.3% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 1610 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 379564 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| a | 79789 | 21.0% | |
| r | 32397 | 8.5% | |
| i | 30758 | 8.1% | |
| n | 30326 | 8.0% | |
| o | 29580 | 7.8% | |
| g | 24049 | 6.3% | |
| M | 16222 | 4.3% | |
| m | 12841 | 3.4% | |
| K | 10511 | 2.8% | |
| u | 10438 | 2.7% | |
| y | 10199 | 2.7% | |
| e | 8760 | 2.3% | |
| w | 8468 | 2.2% | |
| h | 7327 | 1.9% | |
| S | 6875 | 1.8% | |
| b | 6598 | 1.7% | |
| d | 5840 | 1.5% | |
| I | 5294 | 1.4% | |
| l | 5184 | 1.4% | |
| T | 4506 | 1.2% | |
| R | 4448 | 1.2% | |
| j | 4379 | 1.2% | |
| s | 4155 | 1.1% | |
| A | 3350 | 0.9% | |
| D | 3006 | 0.8% | |
| Other values (7) | 14264 | 3.8% |
region_code
Real number (ℝ≥0)
| Distinct count | 27 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.217614780857122 |
|---|---|
| Minimum | 1 |
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 12 |
| Q3 | 17 |
| 95-th percentile | 60 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 17.85525395 |
|---|---|
| Coefficient of variation (CV) | 1.173328028 |
| Kurtosis | 9.958197847 |
| Mean | 15.21761478 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | 3.141767441 |
| Sum | 876352 |
| Variance | 318.8100935 |
| Value | Count | Frequency (%) | |
| 11 | 5297 | 9.2% | |
| 12 | 4639 | 8.1% | |
| 3 | 4379 | 7.6% | |
| 5 | 4040 | 7.0% | |
| 17 | 3954 | 6.9% | |
| 18 | 3324 | 5.8% | |
| 2 | 3024 | 5.3% | |
| 16 | 2816 | 4.9% | |
| 10 | 2640 | 4.6% | |
| 4 | 2513 | 4.4% | |
| 19 | 2295 | 4.0% | |
| 1 | 2201 | 3.8% | |
| 13 | 2093 | 3.6% | |
| 14 | 1979 | 3.4% | |
| 20 | 1969 | 3.4% | |
| 15 | 1808 | 3.1% | |
| 6 | 1609 | 2.8% | |
| 21 | 1583 | 2.7% | |
| 80 | 1238 | 2.1% | |
| 60 | 1025 | 1.8% | |
| 90 | 917 | 1.6% | |
| 7 | 805 | 1.4% | |
| 99 | 423 | 0.7% | |
| 9 | 390 | 0.7% | |
| 24 | 326 | 0.6% | |
| Other values (2) | 301 | 0.5% |
| Value | Count | Frequency (%) | |
| 1 | 2201 | 3.8% | |
| 2 | 3024 | 5.3% | |
| 3 | 4379 | 7.6% | |
| 4 | 2513 | 4.4% | |
| 5 | 4040 | 7.0% | |
| 6 | 1609 | 2.8% | |
| 7 | 805 | 1.4% | |
| 8 | 300 | 0.5% | |
| 9 | 390 | 0.7% | |
| 10 | 2640 | 4.6% |
| Value | Count | Frequency (%) | |
| 99 | 423 | 0.7% | |
| 90 | 917 | 1.6% | |
| 80 | 1238 | 2.1% | |
| 60 | 1025 | 1.8% | |
| 40 | 1 | < 0.1% | |
| 24 | 326 | 0.6% | |
| 21 | 1583 | 2.7% | |
| 20 | 1969 | 3.4% | |
| 19 | 2295 | 4.0% | |
| 18 | 3324 | 5.8% |
district_code
Real number (ℝ≥0)
| Distinct count | 20 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.728311453775092 |
|---|---|
| Minimum | 0 |
| Maximum | 80 |
| Zeros | 23 |
| Zeros (%) | < 0.1% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 30 |
| Maximum | 80 |
| Range | 80 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 9.7602542 |
|---|---|
| Coefficient of variation (CV) | 1.703862347 |
| Kurtosis | 15.65118912 |
| Mean | 5.728311454 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 3.901635064 |
| Sum | 329882 |
| Variance | 95.26256205 |
| Value | Count | Frequency (%) | |
| 1 | 11146 | 19.4% | |
| 2 | 10909 | 18.9% | |
| 3 | 9998 | 17.4% | |
| 4 | 8996 | 15.6% | |
| 5 | 4356 | 7.6% | |
| 6 | 3586 | 6.2% | |
| 7 | 3343 | 5.8% | |
| 8 | 1043 | 1.8% | |
| 30 | 995 | 1.7% | |
| 33 | 874 | 1.5% | |
| 53 | 745 | 1.3% | |
| 43 | 505 | 0.9% | |
| 13 | 391 | 0.7% | |
| 23 | 293 | 0.5% | |
| 63 | 195 | 0.3% | |
| 62 | 109 | 0.2% | |
| 60 | 63 | 0.1% | |
| 0 | 23 | < 0.1% | |
| 80 | 12 | < 0.1% | |
| 67 | 6 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 23 | < 0.1% | |
| 1 | 11146 | 19.4% | |
| 2 | 10909 | 18.9% | |
| 3 | 9998 | 17.4% | |
| 4 | 8996 | 15.6% | |
| 5 | 4356 | 7.6% | |
| 6 | 3586 | 6.2% | |
| 7 | 3343 | 5.8% | |
| 8 | 1043 | 1.8% | |
| 13 | 391 | 0.7% |
| Value | Count | Frequency (%) | |
| 80 | 12 | < 0.1% | |
| 67 | 6 | < 0.1% | |
| 63 | 195 | 0.3% | |
| 62 | 109 | 0.2% | |
| 60 | 63 | 0.1% | |
| 53 | 745 | 1.3% | |
| 43 | 505 | 0.9% | |
| 33 | 874 | 1.5% | |
| 30 | 995 | 1.7% | |
| 23 | 293 | 0.5% |
lga
Categorical
| Distinct count | 124 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count | Frequency (%) | |
| Njombe | 2503 | 4.3% | |
| Arusha Rural | 1252 | 2.2% | |
| Moshi Rural | 1251 | 2.2% | |
| Rungwe | 1106 | 1.9% | |
| Kilosa | 1094 | 1.9% | |
| Kasulu | 1047 | 1.8% | |
| Mbozi | 1034 | 1.8% | |
| Meru | 1009 | 1.8% | |
| Bagamoyo | 997 | 1.7% | |
| Singida Rural | 995 | 1.7% | |
| Kilombero | 959 | 1.7% | |
| Same | 877 | 1.5% | |
| Kibondo | 874 | 1.5% | |
| Kyela | 859 | 1.5% | |
| Kahama | 836 | 1.5% | |
| Kigoma Rural | 824 | 1.4% | |
| Maswa | 809 | 1.4% | |
| Karagwe | 771 | 1.3% | |
| Mbinga | 750 | 1.3% | |
| Iringa Rural | 728 | 1.3% | |
| Serengeti | 716 | 1.2% | |
| Namtumbo | 694 | 1.2% | |
| Lushoto | 694 | 1.2% | |
| Songea Rural | 693 | 1.2% | |
| Mpanda | 679 | 1.2% | |
| Other values (99) | 33537 | 58.2% |
Length
| Max length | 16 |
|---|---|
| Median length | 6 |
| Mean length | 7.463568799 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| a | 67165 | 15.6% | |
| o | 30079 | 7.0% | |
| u | 28005 | 6.5% | |
| i | 26985 | 6.3% | |
| r | 25881 | 6.0% | |
| n | 22521 | 5.2% | |
| e | 22091 | 5.1% | |
| l | 19238 | 4.5% | |
| g | 18066 | 4.2% | |
| M | 15698 | 3.7% | |
| m | 15622 | 3.6% | |
| b | 15603 | 3.6% | |
| R | 12207 | 2.8% | |
| K | 11663 | 2.7% | |
| (space) | 11235 | 2.6% | |
| w | 9820 | 2.3% | |
| s | 9747 | 2.3% | |
| h | 8464 | 2.0% | |
| d | 7405 | 1.7% | |
| S | 6261 | 1.5% | |
| N | 5760 | 1.3% | |
| t | 5208 | 1.2% | |
| y | 4763 | 1.1% | |
| B | 3834 | 0.9% | |
| k | 3721 | 0.9% | |
| Other values (15) | 22770 | 5.3% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 349754 | 81.4% | |
| Uppercase Letter | 68823 | 16.0% | |
| Space Separator | 11235 | 2.6% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| M | 15698 | 22.8% | |
| R | 12207 | 17.7% | |
| K | 11663 | 16.9% | |
| S | 6261 | 9.1% | |
| N | 5760 | 8.4% | |
| B | 3834 | 5.6% | |
| U | 3410 | 5.0% | |
| I | 2480 | 3.6% | |
| L | 2131 | 3.1% | |
| T | 1367 | 2.0% | |
| A | 1315 | 1.9% | |
| H | 1153 | 1.7% | |
| C | 881 | 1.3% | |
| D | 358 | 0.5% | |
| P | 305 | 0.4% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| a | 67165 | 19.2% | |
| o | 30079 | 8.6% | |
| u | 28005 | 8.0% | |
| i | 26985 | 7.7% | |
| r | 25881 | 7.4% | |
| n | 22521 | 6.4% | |
| e | 22091 | 6.3% | |
| l | 19238 | 5.5% | |
| g | 18066 | 5.2% | |
| m | 15622 | 4.5% | |
| b | 15603 | 4.5% | |
| w | 9820 | 2.8% | |
| s | 9747 | 2.8% | |
| h | 8464 | 2.4% | |
| d | 7405 | 2.1% | |
| t | 5208 | 1.5% | |
| y | 4763 | 1.4% | |
| k | 3721 | 1.1% | |
| j | 3496 | 1.0% | |
| z | 1943 | 0.6% | |
| p | 1854 | 0.5% | |
| f | 1106 | 0.3% | |
| v | 671 | 0.2% | |
| c | 300 | 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 11235 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 418577 | 97.4% | |
| Common | 11235 | 2.6% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| a | 67165 | 16.0% | |
| o | 30079 | 7.2% | |
| u | 28005 | 6.7% | |
| i | 26985 | 6.4% | |
| r | 25881 | 6.2% | |
| n | 22521 | 5.4% | |
| e | 22091 | 5.3% | |
| l | 19238 | 4.6% | |
| g | 18066 | 4.3% | |
| M | 15698 | 3.8% | |
| m | 15622 | 3.7% | |
| b | 15603 | 3.7% | |
| R | 12207 | 2.9% | |
| K | 11663 | 2.8% | |
| w | 9820 | 2.3% | |
| s | 9747 | 2.3% | |
| h | 8464 | 2.0% | |
| d | 7405 | 1.8% | |
| S | 6261 | 1.5% | |
| N | 5760 | 1.4% | |
| t | 5208 | 1.2% | |
| y | 4763 | 1.1% | |
| B | 3834 | 0.9% | |
| k | 3721 | 0.9% | |
| j | 3496 | 0.8% | |
| Other values (14) | 19274 | 4.6% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 11235 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 429812 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| a | 67165 | 15.6% | |
| o | 30079 | 7.0% | |
| u | 28005 | 6.5% | |
| i | 26985 | 6.3% | |
| r | 25881 | 6.0% | |
| n | 22521 | 5.2% | |
| e | 22091 | 5.1% | |
| l | 19238 | 4.5% | |
| g | 18066 | 4.2% | |
| M | 15698 | 3.7% | |
| m | 15622 | 3.6% | |
| b | 15603 | 3.6% | |
| R | 12207 | 2.8% | |
| K | 11663 | 2.7% | |
| (space) | 11235 | 2.6% | |
| w | 9820 | 2.3% | |
| s | 9747 | 2.3% | |
| h | 8464 | 2.0% | |
| d | 7405 | 1.7% | |
| S | 6261 | 1.5% | |
| N | 5760 | 1.3% | |
| t | 5208 | 1.2% | |
| y | 4763 | 1.1% | |
| B | 3834 | 0.9% | |
| k | 3721 | 0.9% | |
| Other values (15) | 22770 | 5.3% |
| Distinct count | 2033 |
|---|---|
| Unique (%) | 3.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Igosi | 307 |
|---|---|
| Imalinyi | 252 |
| Siha Kati | 232 |
| Mdandu | 231 |
| Nduruma | 217 |
| Other values (2028) | 56349 |
| Value | Count | Frequency (%) | |
| Igosi | 307 | 0.5% | |
| Imalinyi | 252 | 0.4% | |
| Siha Kati | 232 | 0.4% | |
| Mdandu | 231 | 0.4% | |
| Nduruma | 217 | 0.4% | |
| Kitunda | 203 | 0.4% | |
| Mishamo | 203 | 0.4% | |
| Msindo | 201 | 0.3% | |
| Chalinze | 196 | 0.3% | |
| Maji ya Chai | 190 | 0.3% | |
| Usuka | 187 | 0.3% | |
| Ngarenanyuki | 172 | 0.3% | |
| Chanika | 171 | 0.3% | |
| Vikindu | 162 | 0.3% | |
| Mtwango | 153 | 0.3% | |
| Matola | 145 | 0.3% | |
| Zinga/Ikerege | 141 | 0.2% | |
| Wanging'ombe | 139 | 0.2% | |
| Maramba | 139 | 0.2% | |
| Itete | 137 | 0.2% | |
| Magomeni | 135 | 0.2% | |
| Kikatiti | 134 | 0.2% | |
| Ifakara | 134 | 0.2% | |
| Olkokola | 133 | 0.2% | |
| Maposeni | 130 | 0.2% | |
| Other values (2008) | 53144 | 92.3% |
Length
| Max length | 23 |
|---|---|
| Median length | 7 |
| Mean length | 7.500312565 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| a | 66689 | 15.4% | |
| i | 39117 | 9.1% | |
| n | 28795 | 6.7% | |
| u | 26130 | 6.0% | |
| o | 25396 | 5.9% | |
| e | 23207 | 5.4% | |
| g | 20559 | 4.8% | |
| M | 18603 | 4.3% | |
| m | 15616 | 3.6% | |
| l | 14485 | 3.4% | |
| r | 12856 | 3.0% | |
| b | 12563 | 2.9% | |
| s | 11126 | 2.6% | |
| K | 10847 | 2.5% | |
| h | 10558 | 2.4% | |
| k | 10316 | 2.4% | |
| t | 9198 | 2.1% | |
| d | 8751 | 2.0% | |
| w | 8624 | 2.0% | |
| y | 6962 | 1.6% | |
| I | 6030 | 1.4% | |
| N | 5550 | 1.3% | |
| (space) | 5408 | 1.3% | |
| z | 3548 | 0.8% | |
| S | 3157 | 0.7% | |
| Other values (29) | 27837 | 6.4% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 362710 | 84.0% | |
| Uppercase Letter | 62711 | 14.5% | |
| Space Separator | 5408 | 1.3% | |
| Other Punctuation | 1076 | 0.2% | |
| Dash Punctuation | 23 | < 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| M | 18603 | 29.7% | |
| K | 10847 | 17.3% | |
| I | 6030 | 9.6% | |
| N | 5550 | 8.9% | |
| S | 3157 | 5.0% | |
| L | 3059 | 4.9% | |
| U | 2913 | 4.6% | |
| B | 2902 | 4.6% | |
| C | 1996 | 3.2% | |
| R | 1692 | 2.7% | |
| T | 776 | 1.2% | |
| D | 717 | 1.1% | |
| O | 661 | 1.1% | |
| V | 634 | 1.0% | |
| P | 577 | 0.9% | |
| H | 551 | 0.9% | |
| W | 387 | 0.6% | |
| G | 352 | 0.6% | |
| Z | 304 | 0.5% | |
| E | 289 | 0.5% | |
| A | 260 | 0.4% | |
| J | 187 | 0.3% | |
| Y | 149 | 0.2% | |
| Q | 76 | 0.1% | |
| F | 42 | 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| a | 66689 | 18.4% | |
| i | 39117 | 10.8% | |
| n | 28795 | 7.9% | |
| u | 26130 | 7.2% | |
| o | 25396 | 7.0% | |
| e | 23207 | 6.4% | |
| g | 20559 | 5.7% | |
| m | 15616 | 4.3% | |
| l | 14485 | 4.0% | |
| r | 12856 | 3.5% | |
| b | 12563 | 3.5% | |
| s | 11126 | 3.1% | |
| h | 10558 | 2.9% | |
| k | 10316 | 2.8% | |
| t | 9198 | 2.5% | |
| d | 8751 | 2.4% | |
| w | 8624 | 2.4% | |
| y | 6962 | 1.9% | |
| z | 3548 | 1.0% | |
| p | 2810 | 0.8% | |
| j | 2437 | 0.7% | |
| c | 1364 | 0.4% | |
| f | 810 | 0.2% | |
| v | 777 | 0.2% | |
| q | 16 | < 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 5408 | 100.0% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| ' | 926 | 86.1% | |
| / | 150 | 13.9% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 23 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 425421 | 98.5% | |
| Common | 6507 | 1.5% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| a | 66689 | 15.7% | |
| i | 39117 | 9.2% | |
| n | 28795 | 6.8% | |
| u | 26130 | 6.1% | |
| o | 25396 | 6.0% | |
| e | 23207 | 5.5% | |
| g | 20559 | 4.8% | |
| M | 18603 | 4.4% | |
| m | 15616 | 3.7% | |
| l | 14485 | 3.4% | |
| r | 12856 | 3.0% | |
| b | 12563 | 3.0% | |
| s | 11126 | 2.6% | |
| K | 10847 | 2.5% | |
| h | 10558 | 2.5% | |
| k | 10316 | 2.4% | |
| t | 9198 | 2.2% | |
| d | 8751 | 2.1% | |
| w | 8624 | 2.0% | |
| y | 6962 | 1.6% | |
| I | 6030 | 1.4% | |
| N | 5550 | 1.3% | |
| z | 3548 | 0.8% | |
| S | 3157 | 0.7% | |
| L | 3059 | 0.7% | |
| Other values (25) | 23679 | 5.6% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 5408 | 83.1% | |
| ' | 926 | 14.2% | |
| / | 150 | 2.3% | |
| - | 23 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 431928 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| a | 66689 | 15.4% | |
| i | 39117 | 9.1% | |
| n | 28795 | 6.7% | |
| u | 26130 | 6.0% | |
| o | 25396 | 5.9% | |
| e | 23207 | 5.4% | |
| g | 20559 | 4.8% | |
| M | 18603 | 4.3% | |
| m | 15616 | 3.6% | |
| l | 14485 | 3.4% | |
| r | 12856 | 3.0% | |
| b | 12563 | 2.9% | |
| s | 11126 | 2.6% | |
| K | 10847 | 2.5% | |
| h | 10558 | 2.4% | |
| k | 10316 | 2.4% | |
| t | 9198 | 2.1% | |
| d | 8751 | 2.0% | |
| w | 8624 | 2.0% | |
| y | 6962 | 1.6% | |
| I | 6030 | 1.4% | |
| N | 5550 | 1.3% | |
| (space) | 5408 | 1.3% | |
| z | 3548 | 0.8% | |
| S | 3157 | 0.7% | |
| Other values (29) | 27837 | 6.4% |
| Distinct count | 1049 |
|---|---|
| Unique (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 185.57083072862403 |
|---|---|
| Minimum | 0 |
| Maximum | 30500 |
| Zeros | 19569 |
| Zeros (%) | 34.0% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 35 |
| Q3 | 230 |
| 95-th percentile | 700 |
| Maximum | 30500 |
| Range | 30500 |
| Interquartile range (IQR) | 230 |
Descriptive statistics
| Standard deviation | 477.7442395 |
|---|---|
| Coefficient of variation (CV) | 2.574457622 |
| Kurtosis | 392.9484634 |
| Mean | 185.5708307 |
| Median Absolute Deviation (MAD) | 35 |
| Skewness | 12.51841497 |
| Sum | 10686653 |
| Variance | 228239.5584 |
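The derived rows in the tables above can be cross-checked from the primary figures. The following is a minimal sketch, not a description of how the report computes them: it assumes the usual definitions (mean = sum / observations, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1) and takes the row count of 57588 from the dataset overview. All variable names are illustrative.

```python
import math

# Figures copied from the tables above; 57588 is the dataset's observation count.
n_rows = 57588
total = 10686653            # "Sum"
std = 477.7442395           # "Standard deviation"
q1, q3 = 0, 230             # "Q1", "Q3"
minimum, maximum = 0, 30500

# Derived statistics under the standard definitions (an assumption of this sketch).
mean = total / n_rows
cv = std / mean
variance = std ** 2
iqr = q3 - q1
value_range = maximum - minimum

# Compare against the values printed in the report.
print("mean matches:    ", math.isclose(mean, 185.5708307, rel_tol=1e-6))
print("CV matches:      ", math.isclose(cv, 2.574457622, rel_tol=1e-6))
print("variance matches:", math.isclose(variance, 228239.5584, rel_tol=1e-6))
print("IQR, range:      ", iqr, value_range)
```

A CV above 2.5 together with a median of 35 and a maximum of 30500 is consistent with the strong right skew reported in the Skewness and Kurtosis rows.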
| Value | Count | Frequency (%) | |
| 0 | 19569 | 34.0% | |
| 1 | 7025 | 12.2% | |
| 200 | 1940 | 3.4% | |
| 150 | 1892 | 3.3% | |
| 250 | 1681 | 2.9% | |
| 300 | 1476 | 2.6% | |
| 100 | 1146 | 2.0% | |
| 50 | 1139 | 2.0% | |
| 500 | 1009 | 1.8% | |
| 350 | 986 | 1.7% | |
| 120 | 916 | 1.6% | |
| 400 | 775 | 1.3% | |
| 60 | 706 | 1.2% | |
| 30 | 626 | 1.1% | |
| 40 | 552 | 1.0% | |
| 80 | 533 | 0.9% | |
| 450 | 499 | 0.9% | |
| 20 | 462 | 0.8% | |
| 600 | 438 | 0.8% | |
| 230 | 388 | 0.7% | |
| 75 | 289 | 0.5% | |
| 1000 | 278 | 0.5% | |
| 800 | 269 | 0.5% | |
| 90 | 265 | 0.5% | |
| 130 | 264 | 0.5% | |
| Other values (1024) | 12465 | 21.6% |
| Value | Count | Frequency (%) | |
| 0 | 19569 | 34.0% | |
| 1 | 7025 | 12.2% | |
| 2 | 4 | < 0.1% | |
| 3 | 4 | < 0.1% | |
| 4 | 13 | < 0.1% | |
| 5 | 44 | 0.1% | |
| 6 | 19 | < 0.1% | |
| 7 | 3 | < 0.1% | |
| 8 | 23 | < 0.1% | |
| 9 | 11 | < 0.1% |
| Value | Count | Frequency (%) | |
| 30500 | 1 | < 0.1% | |
| 15300 | 1 | < 0.1% | |
| 11463 | 1 | < 0.1% | |
| 10000 | 3 | < 0.1% | |
| 9865 | 1 | < 0.1% | |
| 9500 | 1 | < 0.1% | |
| 9000 | 3 | < 0.1% | |
| 8848 | 1 | < 0.1% | |
| 8600 | 1 | < 0.1% | |
| 8500 | 1 | < 0.1% |
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 2976 |
| Missing (%) | 5.2% |
| Memory size | 450.0 KiB |
| True | 49737 |
|---|---|
| False | 4875 |
| (Missing) | 2976 |
| Value | Count | Frequency (%) | |
| True | 49737 | 86.4% | |
| False | 4875 | 8.5% | |
| (Missing) | 2976 | 5.2% |
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| GeoData Consultants Ltd |
|---|
| Value | Count | Frequency (%) | |
| GeoData Consultants Ltd | 57588 | 100.0% |
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 23 |
| Min length | 23 |
Most occurring characters
| Value | Count | Frequency (%) | |
| t | 230352 | 17.4% | |
| a | 172764 | 13.0% | |
| o | 115176 | 8.7% | |
| (space) | 115176 | 8.7% | |
| n | 115176 | 8.7% | |
| s | 115176 | 8.7% | |
| G | 57588 | 4.3% | |
| e | 57588 | 4.3% | |
| D | 57588 | 4.3% | |
| C | 57588 | 4.3% | |
| u | 57588 | 4.3% | |
| l | 57588 | 4.3% | |
| L | 57588 | 4.3% | |
| d | 57588 | 4.3% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 978996 | 73.9% | |
| Uppercase Letter | 230352 | 17.4% | |
| Space Separator | 115176 | 8.7% |
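Because this column holds a single constant string, the category counts above can be reproduced by classifying the characters of that one value and multiplying by the number of rows. The sketch below uses Python's standard `unicodedata.category`, which returns two-letter Unicode general category codes. The mapping to the report's long labels is written out by hand here, and nothing in it is a claim about how the report itself was generated.

```python
import unicodedata
from collections import Counter

value = "GeoData Consultants Ltd"   # the constant value shown above
n_rows = 57588                      # every row carries this value

# Hand-written mapping from Unicode category codes to the report's labels.
labels = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
}

# Count categories in one value, then scale by the number of rows.
per_value = Counter(unicodedata.category(ch) for ch in value)
totals = {labels[code]: count * n_rows for code, count in per_value.items()}
grand_total = sum(totals.values())

for name, count in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {count} ({100 * count / grand_total:.1f}%)")
```

The same per-character approach, applied to every row instead of a single constant, would yield the category, script, and block tables shown for the other text columns.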
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| G | 57588 | 25.0% | |
| D | 57588 | 25.0% | |
| C | 57588 | 25.0% | |
| L | 57588 | 25.0% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| t | 230352 | 23.5% | |
| a | 172764 | 17.6% | |
| o | 115176 | 11.8% | |
| n | 115176 | 11.8% | |
| s | 115176 | 11.8% | |
| e | 57588 | 5.9% | |
| u | 57588 | 5.9% | |
| l | 57588 | 5.9% | |
| d | 57588 | 5.9% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 115176 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 1209348 | 91.3% | |
| Common | 115176 | 8.7% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| t | 230352 | 19.0% | |
| a | 172764 | 14.3% | |
| o | 115176 | 9.5% | |
| n | 115176 | 9.5% | |
| s | 115176 | 9.5% | |
| G | 57588 | 4.8% | |
| e | 57588 | 4.8% | |
| D | 57588 | 4.8% | |
| C | 57588 | 4.8% | |
| u | 57588 | 4.8% | |
| l | 57588 | 4.8% | |
| L | 57588 | 4.8% | |
| d | 57588 | 4.8% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 115176 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 1324524 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| t | 230352 | 17.4% | |
| a | 172764 | 13.0% | |
| o | 115176 | 8.7% | |
| (space) | 115176 | 8.7% | |
| n | 115176 | 8.7% | |
| s | 115176 | 8.7% | |
| G | 57588 | 4.3% | |
| e | 57588 | 4.3% | |
| D | 57588 | 4.3% | |
| C | 57588 | 4.3% | |
| u | 57588 | 4.3% | |
| l | 57588 | 4.3% | |
| L | 57588 | 4.3% | |
| d | 57588 | 4.3% |
| Distinct count | 12 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 3750 |
| Missing (%) | 6.5% |
| Memory size | 450.0 KiB |
| VWC | 36143 |
|---|---|
| WUG | 4249 |
| Water authority | 3151 |
| WUA | 2882 |
| Water Board | 2747 |
| Other values (7) | 4666 |
| Value | Count | Frequency (%) | |
| VWC | 36143 | 62.8% | |
| WUG | 4249 | 7.4% | |
| Water authority | 3151 | 5.5% | |
| WUA | 2882 | 5.0% | |
| Water Board | 2747 | 4.8% | |
| Parastatal | 1607 | 2.8% | |
| Private operator | 1063 | 1.8% | |
| Company | 1061 | 1.8% | |
| Other | 765 | 1.3% | |
| SWC | 97 | 0.2% | |
| Trust | 72 | 0.1% | |
| None | 1 | < 0.1% | |
| (Missing) | 3750 | 6.5% |
Length
| Max length | 16 |
|---|---|
| Median length | 3 |
| Mean length | 4.576283253 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| W | 49269 | 18.7% | |
| C | 37301 | 14.2% | |
| V | 36143 | 13.7% | |
| a | 25161 | 9.5% | |
| t | 18377 | 7.0% | |
| r | 17429 | 6.6% | |
| o | 9086 | 3.4% | |
| e | 8790 | 3.3% | |
| n | 8562 | 3.2% | |
| U | 7131 | 2.7% | |
| (space) | 6961 | 2.6% | |
| G | 4249 | 1.6% | |
| i | 4214 | 1.6% | |
| y | 4212 | 1.6% | |
| h | 3916 | 1.5% | |
| u | 3223 | 1.2% | |
| A | 2882 | 1.1% | |
| B | 2747 | 1.0% | |
| d | 2747 | 1.0% | |
| P | 2670 | 1.0% | |
| p | 2124 | 0.8% | |
| s | 1679 | 0.6% | |
| l | 1607 | 0.6% | |
| v | 1063 | 0.4% | |
| m | 1061 | 0.4% | |
| Other values (4) | 935 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Uppercase Letter | 143327 | 54.4% | |
| Lowercase Letter | 113251 | 43.0% | |
| Space Separator | 6961 | 2.6% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| W | 49269 | 34.4% | |
| C | 37301 | 26.0% | |
| V | 36143 | 25.2% | |
| U | 7131 | 5.0% | |
| G | 4249 | 3.0% | |
| A | 2882 | 2.0% | |
| B | 2747 | 1.9% | |
| P | 2670 | 1.9% | |
| O | 765 | 0.5% | |
| S | 97 | 0.1% | |
| T | 72 | 0.1% | |
| N | 1 | < 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| a | 25161 | 22.2% | |
| t | 18377 | 16.2% | |
| r | 17429 | 15.4% | |
| o | 9086 | 8.0% | |
| e | 8790 | 7.8% | |
| n | 8562 | 7.6% | |
| i | 4214 | 3.7% | |
| y | 4212 | 3.7% | |
| h | 3916 | 3.5% | |
| u | 3223 | 2.8% | |
| d | 2747 | 2.4% | |
| p | 2124 | 1.9% | |
| s | 1679 | 1.5% | |
| l | 1607 | 1.4% | |
| v | 1063 | 0.9% | |
| m | 1061 | 0.9% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 6961 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 256578 | 97.4% | |
| Common | 6961 | 2.6% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| W | 49269 | 19.2% | |
| C | 37301 | 14.5% | |
| V | 36143 | 14.1% | |
| a | 25161 | 9.8% | |
| t | 18377 | 7.2% | |
| r | 17429 | 6.8% | |
| o | 9086 | 3.5% | |
| e | 8790 | 3.4% | |
| n | 8562 | 3.3% | |
| U | 7131 | 2.8% | |
| G | 4249 | 1.7% | |
| i | 4214 | 1.6% | |
| y | 4212 | 1.6% | |
| h | 3916 | 1.5% | |
| u | 3223 | 1.3% | |
| A | 2882 | 1.1% | |
| B | 2747 | 1.1% | |
| d | 2747 | 1.1% | |
| P | 2670 | 1.0% | |
| p | 2124 | 0.8% | |
| s | 1679 | 0.7% | |
| l | 1607 | 0.6% | |
| v | 1063 | 0.4% | |
| m | 1061 | 0.4% | |
| O | 765 | 0.3% | |
| Other values (3) | 170 | 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 6961 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 263539 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| W | 49269 | 18.7% | |
| C | 37301 | 14.2% | |
| V | 36143 | 13.7% | |
| a | 25161 | 9.5% | |
| t | 18377 | 7.0% | |
| r | 17429 | 6.6% | |
| o | 9086 | 3.4% | |
| e | 8790 | 3.3% | |
| n | 8562 | 3.2% | |
| U | 7131 | 2.7% | |
| (space) | 6961 | 2.6% | |
| G | 4249 | 1.6% | |
| i | 4214 | 1.6% | |
| y | 4212 | 1.6% | |
| h | 3916 | 1.5% | |
| u | 3223 | 1.2% | |
| A | 2882 | 1.1% | |
| B | 2747 | 1.0% | |
| d | 2747 | 1.0% | |
| P | 2670 | 1.0% | |
| p | 2124 | 0.8% | |
| s | 1679 | 0.6% | |
| l | 1607 | 0.6% | |
| v | 1063 | 0.4% | |
| m | 1061 | 0.4% | |
| Other values (4) | 935 | 0.4% |
| Distinct count | 2658 |
|---|---|
| Unique (%) | 8.6% |
| Missing | 26692 |
| Missing (%) | 46.3% |
| Memory size | 450.0 KiB |
| K | 682 |
|---|---|
| None | 644 |
| Borehole | 418 |
| Chalinze wate | 405 |
| M | 400 |
| Other values (2653) | 28347 |
| Value | Count | Frequency (%) | |
| K | 682 | 1.2% | |
| None | 644 | 1.1% | |
| Borehole | 418 | 0.7% | |
| Chalinze wate | 405 | 0.7% | |
| M | 400 | 0.7% | |
| DANIDA | 379 | 0.7% | |
| Government | 320 | 0.6% | |
| Ngana water supplied scheme | 270 | 0.5% | |
| wanging'ombe water supply s | 261 | 0.5% | |
| wanging'ombe supply scheme | 234 | 0.4% | |
| I | 229 | 0.4% | |
| Bagamoyo wate | 229 | 0.4% | |
| Uroki-Bomang'ombe water sup | 209 | 0.4% | |
| N | 204 | 0.4% | |
| Kirua kahe gravity water supply trust | 193 | 0.3% | |
| Machumba estate pipe line | 185 | 0.3% | |
| Makwale water supplied sche | 166 | 0.3% | |
| Kijiji | 161 | 0.3% | |
| S | 154 | 0.3% | |
| mtwango water supply scheme | 152 | 0.3% | |
| Handeni Trunk Main(H | 152 | 0.3% | |
| Losaa-Kia water supply | 152 | 0.3% | |
| Mkongoro Two | 147 | 0.3% | |
| Roman | 139 | 0.2% | |
| Mkongoro One | 128 | 0.2% | |
| Other values (2633) | 24283 | 42.2% | |
| (Missing) | 26692 | 46.3% |
Length
| Max length | 46 |
|---|---|
| Median length | 3 |
| Mean length | 9.090383413 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| a | 74781 | 14.3% | |
| n | 71076 | 13.6% | |
| (space) | 41173 | 7.9% | |
| e | 34849 | 6.7% | |
| i | 26379 | 5.0% | |
| p | 22391 | 4.3% | |
| r | 21629 | 4.1% | |
| t | 19113 | 3.7% | |
| u | 18262 | 3.5% | |
| o | 17111 | 3.3% | |
| l | 17014 | 3.3% | |
| s | 16401 | 3.1% | |
| w | 16318 | 3.1% | |
| m | 14008 | 2.7% | |
| y | 12023 | 2.3% | |
| g | 11270 | 2.2% | |
| M | 9311 | 1.8% | |
| h | 7882 | 1.5% | |
| K | 5528 | 1.1% | |
| d | 5527 | 1.1% | |
| k | 5310 | 1.0% | |
| b | 5106 | 1.0% | |
| c | 4978 | 1.0% | |
| N | 4318 | 0.8% | |
| S | 3737 | 0.7% | |
| Other values (41) | 38002 | 7.3% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 430386 | 82.2% | |
| Uppercase Letter | 49658 | 9.5% | |
| Space Separator | 41173 | 7.9% | |
| Other Punctuation | 1301 | 0.2% | |
| Dash Punctuation | 554 | 0.1% | |
| Open Punctuation | 191 | < 0.1% | |
| Decimal Number | 133 | < 0.1% | |
| Modifier Symbol | 70 | < 0.1% | |
| Close Punctuation | 31 | < 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| M | 9311 | 18.8% | |
| K | 5528 | 11.1% | |
| N | 4318 | 8.7% | |
| S | 3737 | 7.5% | |
| A | 2729 | 5.5% | |
| I | 2688 | 5.4% | |
| W | 2501 | 5.0% | |
| B | 2259 | 4.5% | |
| L | 2106 | 4.2% | |
| U | 1790 | 3.6% | |
| D | 1576 | 3.2% | |
| T | 1543 | 3.1% | |
| C | 1526 | 3.1% | |
| R | 1407 | 2.8% | |
| E | 1336 | 2.7% | |
| P | 1047 | 2.1% | |
| H | 1023 | 2.1% | |
| O | 955 | 1.9% | |
| G | 899 | 1.8% | |
| J | 385 | 0.8% | |
| V | 369 | 0.7% | |
| Y | 268 | 0.5% | |
| F | 224 | 0.5% | |
| Z | 91 | 0.2% | |
| Q | 42 | 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| a | 74781 | 17.4% | |
| n | 71076 | 16.5% | |
| e | 34849 | 8.1% | |
| i | 26379 | 6.1% | |
| p | 22391 | 5.2% | |
| r | 21629 | 5.0% | |
| t | 19113 | 4.4% | |
| u | 18262 | 4.2% | |
| o | 17111 | 4.0% | |
| l | 17014 | 4.0% | |
| s | 16401 | 3.8% | |
| w | 16318 | 3.8% | |
| m | 14008 | 3.3% | |
| y | 12023 | 2.8% | |
| g | 11270 | 2.6% | |
| h | 7882 | 1.8% | |
| d | 5527 | 1.3% | |
| k | 5310 | 1.2% | |
| b | 5106 | 1.2% | |
| c | 4978 | 1.2% | |
| v | 3255 | 0.8% | |
| j | 3062 | 0.7% | |
| z | 1646 | 0.4% | |
| f | 955 | 0.2% | |
| q | 36 | < 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 41173 | 100.0% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| ' | 922 | 70.9% | |
| / | 370 | 28.4% | |
| & | 8 | 0.6% | |
| : | 1 | 0.1% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 554 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 2 | 61 | 45.9% | |
| 3 | 55 | 41.4% | |
| 7 | 7 | 5.3% | |
| 5 | 4 | 3.0% | |
| 0 | 3 | 2.3% | |
| 6 | 3 | 2.3% |
Most frequent Open Punctuation characters
| Value | Count | Frequency (%) | |
| ( | 191 | 100.0% |
Most frequent Close Punctuation characters
| Value | Count | Frequency (%) | |
| ) | 31 | 100.0% |
Most frequent Modifier Symbol characters
| Value | Count | Frequency (%) | |
| ` | 70 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 480044 | 91.7% | |
| Common | 43453 | 8.3% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| a | 74781 | 15.6% | |
| n | 71076 | 14.8% | |
| e | 34849 | 7.3% | |
| i | 26379 | 5.5% | |
| p | 22391 | 4.7% | |
| r | 21629 | 4.5% | |
| t | 19113 | 4.0% | |
| u | 18262 | 3.8% | |
| o | 17111 | 3.6% | |
| l | 17014 | 3.5% | |
| s | 16401 | 3.4% | |
| w | 16318 | 3.4% | |
| m | 14008 | 2.9% | |
| y | 12023 | 2.5% | |
| g | 11270 | 2.3% | |
| M | 9311 | 1.9% | |
| h | 7882 | 1.6% | |
| K | 5528 | 1.2% | |
| d | 5527 | 1.2% | |
| k | 5310 | 1.1% | |
| b | 5106 | 1.1% | |
| c | 4978 | 1.0% | |
| N | 4318 | 0.9% | |
| S | 3737 | 0.8% | |
| v | 3255 | 0.7% | |
| Other values (26) | 32467 | 6.8% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 41173 | 94.8% | |
| ' | 922 | 2.1% | |
| - | 554 | 1.3% | |
| / | 370 | 0.9% | |
| ( | 191 | 0.4% | |
| ` | 70 | 0.2% | |
| 2 | 61 | 0.1% | |
| 3 | 55 | 0.1% | |
| ) | 31 | 0.1% | |
| & | 8 | < 0.1% | |
| 7 | 7 | < 0.1% | |
| 5 | 4 | < 0.1% | |
| 0 | 3 | < 0.1% | |
| 6 | 3 | < 0.1% | |
| : | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 523497 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| a | 74781 | 14.3% | |
| n | 71076 | 13.6% | |
| (space) | 41173 | 7.9% | |
| e | 34849 | 6.7% | |
| i | 26379 | 5.0% | |
| p | 22391 | 4.3% | |
| r | 21629 | 4.1% | |
| t | 19113 | 3.7% | |
| u | 18262 | 3.5% | |
| o | 17111 | 3.3% | |
| l | 17014 | 3.3% | |
| s | 16401 | 3.1% | |
| w | 16318 | 3.1% | |
| m | 14008 | 2.7% | |
| y | 12023 | 2.3% | |
| g | 11270 | 2.2% | |
| M | 9311 | 1.8% | |
| h | 7882 | 1.5% | |
| K | 5528 | 1.1% | |
| d | 5527 | 1.1% | |
| k | 5310 | 1.0% | |
| b | 5106 | 1.0% | |
| c | 4978 | 1.0% | |
| N | 4318 | 0.8% | |
| S | 3737 | 0.7% | |
| Other values (41) | 38002 | 7.3% |
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 3056 |
| Missing (%) | 5.3% |
| Memory size | 450.0 KiB |
| True | 38100 |
|---|---|
| False | 16432 |
| (Missing) | 3056 |
| Value | Count | Frequency (%) | |
| True | 38100 | 66.2% | |
| False | 16432 | 28.5% | |
| (Missing) | 3056 | 5.3% |
| Distinct count | 55 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1341.577359866639 |
|---|---|
| Minimum | 0 |
| Maximum | 2013 |
| Zeros | 18897 |
| Zeros (%) | 32.8% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1988 |
| Q3 | 2004 |
| 95-th percentile | 2010 |
| Maximum | 2013 |
| Range | 2013 |
| Interquartile range (IQR) | 2004 |
Descriptive statistics
| Standard deviation | 937.6413676 |
|---|---|
| Coefficient of variation (CV) | 0.6989096534 |
| Kurtosis | -1.464166924 |
| Mean | 1341.57736 |
| Median Absolute Deviation (MAD) | 20 |
| Skewness | -0.7316760076 |
| Sum | 77258757 |
| Variance | 879171.3343 |
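The non-zero values of this variable all fall between 1960 and 2013, while 32.8% of rows are exactly 0, so the mean of about 1341 and the IQR of 2004 describe a mixture rather than a typical value. The sketch below shows how much the zeros pull the mean down, assuming (the report does not confirm this) that 0 is a placeholder for an unknown value. It uses only figures printed above, and the variable names are illustrative.

```python
# Figures copied from the tables above.
n_rows = 57588
zeros = 18897       # "Zeros"
total = 77258757    # "Sum"

# Zeros add nothing to the sum, so the mean over non-zero rows
# is the same sum divided by the smaller row count.
nonzero_rows = n_rows - zeros
mean_all = total / n_rows
mean_nonzero = total / nonzero_rows

print(f"rows with a non-zero value: {nonzero_rows}")
print(f"mean including zeros:       {mean_all:.2f}")
print(f"mean excluding zeros:       {mean_nonzero:.2f}")
```

If the placeholder assumption holds, recoding 0 as missing before analysis would give summary statistics that reflect only the recorded values.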
| Value | Count | Frequency (%) | |
| 0 | 18897 | 32.8% | |
| 2010 | 2645 | 4.6% | |
| 2008 | 2613 | 4.5% | |
| 2009 | 2533 | 4.4% | |
| 2000 | 2091 | 3.6% | |
| 2007 | 1587 | 2.8% | |
| 2006 | 1471 | 2.6% | |
| 2003 | 1286 | 2.2% | |
| 2011 | 1256 | 2.2% | |
| 2004 | 1123 | 2.0% | |
| 2012 | 1084 | 1.9% | |
| 2002 | 1075 | 1.9% | |
| 1978 | 1037 | 1.8% | |
| 1995 | 1014 | 1.8% | |
| 2005 | 1011 | 1.8% | |
| 1999 | 979 | 1.7% | |
| 1998 | 966 | 1.7% | |
| 1990 | 954 | 1.7% | |
| 1985 | 945 | 1.6% | |
| 1980 | 811 | 1.4% | |
| 1996 | 811 | 1.4% | |
| 1984 | 779 | 1.4% | |
| 1982 | 744 | 1.3% | |
| 1994 | 738 | 1.3% | |
| 1972 | 708 | 1.2% | |
| Other values (30) | 8430 | 14.6% |
| Value | Count | Frequency (%) | |
| 0 | 18897 | 32.8% | |
| 1960 | 102 | 0.2% | |
| 1961 | 21 | < 0.1% | |
| 1962 | 30 | 0.1% | |
| 1963 | 85 | 0.1% | |
| 1964 | 40 | 0.1% | |
| 1965 | 19 | < 0.1% | |
| 1966 | 17 | < 0.1% | |
| 1967 | 88 | 0.2% | |
| 1968 | 77 | 0.1% |
| Value | Count | Frequency (%) | |
| 2013 | 176 | 0.3% | |
| 2012 | 1084 | 1.9% | |
| 2011 | 1256 | 2.2% | |
| 2010 | 2645 | 4.6% | |
| 2009 | 2533 | 4.4% | |
| 2008 | 2613 | 4.5% | |
| 2007 | 1587 | 2.8% | |
| 2006 | 1471 | 2.6% | |
| 2005 | 1011 | 1.8% | |
| 2004 | 1123 | 2.0% |
| Distinct count | 18 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| gravity | 26696 |
|---|---|
| nira/tanira | 7361 |
| other | 6160 |
| submersible | 4688 |
| swn 80 | 3448 |
| Other values (13) | 9235 |
| Value | Count | Frequency (%) | |
| gravity | 26696 | 46.4% | |
| nira/tanira | 7361 | 12.8% | |
| other | 6160 | 10.7% | |
| submersible | 4688 | 8.1% | |
| swn 80 | 3448 | 6.0% | |
| mono | 2817 | 4.9% | |
| india mark ii | 2284 | 4.0% | |
| afridev | 1659 | 2.9% | |
| ksb | 1358 | 2.4% | |
| other - rope pump | 451 | 0.8% | |
| other - swn 81 | 229 | 0.4% | |
| windmill | 117 | 0.2% | |
| india mark iii | 91 | 0.2% | |
| cemo | 90 | 0.2% | |
| other - play pump | 85 | 0.1% | |
| climax | 32 | 0.1% | |
| walimi | 20 | < 0.1% | |
| other - mkulima/shinyanga | 2 | < 0.1% |
Length
| Max length | 25 |
|---|---|
| Median length | 7 |
| Mean length | 7.689032437 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| i | 57666 | 13.0% | |
| r | 57518 | 13.0% | |
| a | 55331 | 12.5% | |
| t | 40984 | 9.3% | |
| v | 28355 | 6.4% | |
| y | 26783 | 6.0% | |
| g | 26698 | 6.0% | |
| n | 23712 | 5.4% | |
| e | 18503 | 4.2% | |
| s | 14413 | 3.3% | |
| o | 13102 | 3.0% | |
| b | 10734 | 2.4% | |
| m | 10679 | 2.4% | |
| (space) | 10497 | 2.4% | |
| / | 7363 | 1.7% | |
| h | 6929 | 1.6% | |
| u | 5226 | 1.2% | |
| l | 5061 | 1.1% | |
| d | 4151 | 0.9% | |
| w | 3814 | 0.9% | |
| k | 3735 | 0.8% | |
| 8 | 3677 | 0.8% | |
| 0 | 3448 | 0.8% | |
| f | 1659 | 0.4% | |
| p | 1608 | 0.4% | |
| Other values (4) | 1150 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 416815 | 94.1% | |
| Space Separator | 10497 | 2.4% | |
| Other Punctuation | 7363 | 1.7% | |
| Decimal Number | 7354 | 1.7% | |
| Dash Punctuation | 767 | 0.2% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| i | 57666 | 13.8% | |
| r | 57518 | 13.8% | |
| a | 55331 | 13.3% | |
| t | 40984 | 9.8% | |
| v | 28355 | 6.8% | |
| y | 26783 | 6.4% | |
| g | 26698 | 6.4% | |
| n | 23712 | 5.7% | |
| e | 18503 | 4.4% | |
| s | 14413 | 3.5% | |
| o | 13102 | 3.1% | |
| b | 10734 | 2.6% | |
| m | 10679 | 2.6% | |
| h | 6929 | 1.7% | |
| u | 5226 | 1.3% | |
| l | 5061 | 1.2% | |
| d | 4151 | 1.0% | |
| w | 3814 | 0.9% | |
| k | 3735 | 0.9% | |
| f | 1659 | 0.4% | |
| p | 1608 | 0.4% | |
| c | 122 | < 0.1% | |
| x | 32 | < 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 10497 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 8 | 3677 | 50.0% | |
| 0 | 3448 | 46.9% | |
| 1 | 229 | 3.1% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| / | 7363 | 100.0% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 767 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 416815 | 94.1% | |
| Common | 25981 | 5.9% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| i | 57666 | 13.8% | |
| r | 57518 | 13.8% | |
| a | 55331 | 13.3% | |
| t | 40984 | 9.8% | |
| v | 28355 | 6.8% | |
| y | 26783 | 6.4% | |
| g | 26698 | 6.4% | |
| n | 23712 | 5.7% | |
| e | 18503 | 4.4% | |
| s | 14413 | 3.5% | |
| o | 13102 | 3.1% | |
| b | 10734 | 2.6% | |
| m | 10679 | 2.6% | |
| h | 6929 | 1.7% | |
| u | 5226 | 1.3% | |
| l | 5061 | 1.2% | |
| d | 4151 | 1.0% | |
| w | 3814 | 0.9% | |
| k | 3735 | 0.9% | |
| f | 1659 | 0.4% | |
| p | 1608 | 0.4% | |
| c | 122 | < 0.1% | |
| x | 32 | < 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 10497 | 40.4% | |
| / | 7363 | 28.3% | |
| 8 | 3677 | 14.2% | |
| 0 | 3448 | 13.3% | |
| - | 767 | 3.0% | |
| 1 | 229 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 442796 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| i | 57666 | 13.0% | |
| r | 57518 | 13.0% | |
| a | 55331 | 12.5% | |
| t | 40984 | 9.3% | |
| v | 28355 | 6.4% | |
| y | 26783 | 6.0% | |
| g | 26698 | 6.0% | |
| n | 23712 | 5.4% | |
| e | 18503 | 4.2% | |
| s | 14413 | 3.3% | |
| o | 13102 | 3.0% | |
| b | 10734 | 2.4% | |
| m | 10679 | 2.4% | |
| (space) | 10497 | 2.4% | |
| / | 7363 | 1.7% | |
| h | 6929 | 1.6% | |
| u | 5226 | 1.2% | |
| l | 5061 | 1.1% | |
| d | 4151 | 0.9% | |
| w | 3814 | 0.9% | |
| k | 3735 | 0.8% | |
| 8 | 3677 | 0.8% | |
| 0 | 3448 | 0.8% | |
| f | 1659 | 0.4% | |
| p | 1608 | 0.4% | |
| Other values (4) | 1150 | 0.3% |
| Distinct count | 13 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| gravity | 26696 |
|---|---|
| nira/tanira | 7361 |
| other | 6160 |
| submersible | 6046 |
| swn 80 | 3448 |
| Other values (8) | 7877 |
| Value | Count | Frequency (%) | |
| gravity | 26696 | 46.4% | |
| nira/tanira | 7361 | 12.8% | |
| other | 6160 | 10.7% | |
| submersible | 6046 | 10.5% | |
| swn 80 | 3448 | 6.0% | |
| mono | 2817 | 4.9% | |
| india mark ii | 2284 | 4.0% | |
| afridev | 1659 | 2.9% | |
| rope pump | 451 | 0.8% | |
| other handpump | 336 | 0.6% | |
| other motorpump | 122 | 0.2% | |
| wind-powered | 117 | 0.2% | |
| india mark iii | 91 | 0.2% |
Length
| Max length | 15 |
|---|---|
| Median length | 7 |
| Mean length | 7.843318052 |
| Min length | 4 |
Most occurring characters
| Value | Count | Frequency (%) | |
| i | 58831 | 13.0% | |
| r | 58806 | 13.0% | |
| a | 55524 | 12.3% | |
| t | 40797 | 9.0% | |
| v | 28355 | 6.3% | |
| g | 26696 | 5.9% | |
| y | 26696 | 5.9% | |
| n | 23815 | 5.3% | |
| e | 21054 | 4.7% | |
| s | 15540 | 3.4% | |
| o | 13064 | 2.9% | |
| m | 12269 | 2.7% | |
| b | 12092 | 2.7% | |
| (space) | 9107 | 2.0% | |
| / | 7361 | 1.6% | |
| u | 6955 | 1.5% | |
| h | 6954 | 1.5% | |
| l | 6046 | 1.3% | |
| d | 4604 | 1.0% | |
| w | 3682 | 0.8% | |
| 8 | 3448 | 0.8% | |
| 0 | 3448 | 0.8% | |
| p | 2386 | 0.5% | |
| k | 2375 | 0.5% | |
| f | 1659 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 428200 | 94.8% | |
| Space Separator | 9107 | 2.0% | |
| Other Punctuation | 7361 | 1.6% | |
| Decimal Number | 6896 | 1.5% | |
| Dash Punctuation | 117 | < 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| i | 58831 | 13.7% | |
| r | 58806 | 13.7% | |
| a | 55524 | 13.0% | |
| t | 40797 | 9.5% | |
| v | 28355 | 6.6% | |
| g | 26696 | 6.2% | |
| y | 26696 | 6.2% | |
| n | 23815 | 5.6% | |
| e | 21054 | 4.9% | |
| s | 15540 | 3.6% | |
| o | 13064 | 3.1% | |
| m | 12269 | 2.9% | |
| b | 12092 | 2.8% | |
| u | 6955 | 1.6% | |
| h | 6954 | 1.6% | |
| l | 6046 | 1.4% | |
| d | 4604 | 1.1% | |
| w | 3682 | 0.9% | |
| p | 2386 | 0.6% | |
| k | 2375 | 0.6% | |
| f | 1659 | 0.4% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 9107 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 8 | 3448 | 50.0% | |
| 0 | 3448 | 50.0% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| / | 7361 | 100.0% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 117 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 428200 | 94.8% | |
| Common | 23481 | 5.2% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| i | 58831 | 13.7% | |
| r | 58806 | 13.7% | |
| a | 55524 | 13.0% | |
| t | 40797 | 9.5% | |
| v | 28355 | 6.6% | |
| g | 26696 | 6.2% | |
| y | 26696 | 6.2% | |
| n | 23815 | 5.6% | |
| e | 21054 | 4.9% | |
| s | 15540 | 3.6% | |
| o | 13064 | 3.1% | |
| m | 12269 | 2.9% | |
| b | 12092 | 2.8% | |
| u | 6955 | 1.6% | |
| h | 6954 | 1.6% | |
| l | 6046 | 1.4% | |
| d | 4604 | 1.1% | |
| w | 3682 | 0.9% | |
| p | 2386 | 0.6% | |
| k | 2375 | 0.6% | |
| f | 1659 | 0.4% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 9107 | 38.8% | |
| / | 7361 | 31.3% | |
| 8 | 3448 | 14.7% | |
| 0 | 3448 | 14.7% | |
| - | 117 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 451681 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| i | 58831 | 13.0% | |
| r | 58806 | 13.0% | |
| a | 55524 | 12.3% | |
| t | 40797 | 9.0% | |
| v | 28355 | 6.3% | |
| g | 26696 | 5.9% | |
| y | 26696 | 5.9% | |
| n | 23815 | 5.3% | |
| e | 21054 | 4.7% | |
| s | 15540 | 3.4% | |
| o | 13064 | 2.9% | |
| m | 12269 | 2.7% | |
| b | 12092 | 2.7% | |
| (space) | 9107 | 2.0% | |
| / | 7361 | 1.6% | |
| u | 6955 | 1.5% | |
| h | 6954 | 1.5% | |
| l | 6046 | 1.3% | |
| d | 4604 | 1.0% | |
| w | 3682 | 0.8% | |
| 8 | 3448 | 0.8% | |
| 0 | 3448 | 0.8% | |
| p | 2386 | 0.5% | |
| k | 2375 | 0.5% | |
| f | 1659 | 0.4% |
| Distinct count | 7 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count |
|---|---|
| gravity | 26696 |
| handpump | 15179 |
| other | 6160 |
| submersible | 6046 |
| motorpump | 2939 |
| Other values (2) | 568 |
| Value | Count | Frequency (%) | |
| gravity | 26696 | 46.4% | |
| handpump | 15179 | 26.4% | |
| other | 6160 | 10.7% | |
| submersible | 6046 | 10.5% | |
| motorpump | 2939 | 5.1% | |
| rope pump | 451 | 0.8% | |
| wind-powered | 117 | 0.2% |
Length
| Max length | 12 |
|---|---|
| Median length | 7 |
| Mean length | 7.597485587 |
| Min length | 5 |
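The length statistics can be cross-checked against the frequency table above. The following is a minimal sketch, assuming each statistic is taken over the string lengths of all observations (every value weighted by its count); the helper name `length_stats` is hypothetical and this is not pandas-profiling's own code.

```python
from statistics import median

# Value counts copied from the frequency table above.
value_counts = {
    "gravity": 26696,
    "handpump": 15179,
    "other": 6160,
    "submersible": 6046,
    "motorpump": 2939,
    "rope pump": 451,
    "wind-powered": 117,
}


def length_stats(counts):
    """Return max, median, mean and min string length, weighting each value by its count."""
    # Expand to one length per observation so the median is taken over rows, not categories.
    lengths = [len(value) for value, n in counts.items() for _ in range(n)]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": sum(lengths) / len(lengths),
        "min": min(lengths),
        "total_chars": sum(lengths),
    }


stats = length_stats(value_counts)
print(stats)
```

Under that assumption the sketch reproduces the reported maximum, median, mean and minimum, and the total character count agrees with the ASCII block total reported for this variable.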
Most occurring characters
| Value | Count | Frequency (%) | |
| r | 42409 | 9.7% | |
| a | 41875 | 9.6% | |
| p | 37706 | 8.6% | |
| t | 35795 | 8.2% | |
| i | 32859 | 7.5% | |
| m | 27554 | 6.3% | |
| g | 26696 | 6.1% | |
| v | 26696 | 6.1% | |
| y | 26696 | 6.1% | |
| u | 24615 | 5.6% | |
| h | 21339 | 4.9% | |
| e | 18937 | 4.3% | |
| d | 15413 | 3.5% | |
| n | 15296 | 3.5% | |
| o | 12606 | 2.9% | |
| s | 12092 | 2.8% | |
| b | 12092 | 2.8% | |
| l | 6046 | 1.4% | |
| (space) | 451 | 0.1% | |
| w | 234 | 0.1% | |
| - | 117 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 436956 | 99.9% | |
| Space Separator | 451 | 0.1% | |
| Dash Punctuation | 117 | < 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| r | 42409 | 9.7% | |
| a | 41875 | 9.6% | |
| p | 37706 | 8.6% | |
| t | 35795 | 8.2% | |
| i | 32859 | 7.5% | |
| m | 27554 | 6.3% | |
| g | 26696 | 6.1% | |
| v | 26696 | 6.1% | |
| y | 26696 | 6.1% | |
| u | 24615 | 5.6% | |
| h | 21339 | 4.9% | |
| e | 18937 | 4.3% | |
| d | 15413 | 3.5% | |
| n | 15296 | 3.5% | |
| o | 12606 | 2.9% | |
| s | 12092 | 2.8% | |
| b | 12092 | 2.8% | |
| l | 6046 | 1.4% | |
| w | 234 | 0.1% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 117 | 100.0% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 451 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 436956 | 99.9% | |
| Common | 568 | 0.1% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| r | 42409 | 9.7% | |
| a | 41875 | 9.6% | |
| p | 37706 | 8.6% | |
| t | 35795 | 8.2% | |
| i | 32859 | 7.5% | |
| m | 27554 | 6.3% | |
| g | 26696 | 6.1% | |
| v | 26696 | 6.1% | |
| y | 26696 | 6.1% | |
| u | 24615 | 5.6% | |
| h | 21339 | 4.9% | |
| e | 18937 | 4.3% | |
| d | 15413 | 3.5% | |
| n | 15296 | 3.5% | |
| o | 12606 | 2.9% | |
| s | 12092 | 2.8% | |
| b | 12092 | 2.8% | |
| l | 6046 | 1.4% | |
| w | 234 | 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 451 | 79.4% | |
| - | 117 | 20.6% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 437524 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| r | 42409 | 9.7% | |
| a | 41875 | 9.6% | |
| p | 37706 | 8.6% | |
| t | 35795 | 8.2% | |
| i | 32859 | 7.5% | |
| m | 27554 | 6.3% | |
| g | 26696 | 6.1% | |
| v | 26696 | 6.1% | |
| y | 26696 | 6.1% | |
| u | 24615 | 5.6% | |
| h | 21339 | 4.9% | |
| e | 18937 | 4.3% | |
| d | 15413 | 3.5% | |
| n | 15296 | 3.5% | |
| o | 12606 | 2.9% | |
| s | 12092 | 2.8% | |
| b | 12092 | 2.8% | |
| l | 6046 | 1.4% | |
| (space) | 451 | 0.1% | |
| w | 234 | 0.1% | |
| - | 117 | < 0.1% |
| Distinct count | 12 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count |
|---|---|
| vwc | 39746 |
| wug | 5556 |
| water board | 2932 |
| wua | 2533 |
| private operator | 1970 |
| Other values (7) | 4851 |
| Value | Count | Frequency (%) | |
| vwc | 39746 | 69.0% | |
| wug | 5556 | 9.6% | |
| water board | 2932 | 5.1% | |
| wua | 2533 | 4.4% | |
| private operator | 1970 | 3.4% | |
| parastatal | 1696 | 2.9% | |
| water authority | 902 | 1.6% | |
| other | 840 | 1.5% | |
| company | 685 | 1.2% | |
| unknown | 551 | 1.0% | |
| other - school | 99 | 0.2% | |
| trust | 78 | 0.1% |
Length
| Max length | 16 |
|---|---|
| Median length | 3 |
| Mean length | 4.382770716 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| w | 52220 | 20.7% | |
| v | 41716 | 16.5% | |
| c | 40530 | 16.1% | |
| a | 21610 | 8.6% | |
| r | 16291 | 6.5% | |
| t | 14065 | 5.6% | |
| o | 10147 | 4.0% | |
| u | 9620 | 3.8% | |
| e | 8713 | 3.5% | |
| p | 6321 | 2.5% | |
| (space) | 6002 | 2.4% | |
| g | 5556 | 2.2% | |
| b | 2932 | 1.2% | |
| d | 2932 | 1.2% | |
| i | 2872 | 1.1% | |
| n | 2338 | 0.9% | |
| h | 1940 | 0.8% | |
| s | 1873 | 0.7% | |
| l | 1795 | 0.7% | |
| y | 1587 | 0.6% | |
| m | 685 | 0.3% | |
| k | 551 | 0.2% | |
| - | 99 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 246294 | 97.6% | |
| Space Separator | 6002 | 2.4% | |
| Dash Punctuation | 99 | < 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| w | 52220 | 21.2% | |
| v | 41716 | 16.9% | |
| c | 40530 | 16.5% | |
| a | 21610 | 8.8% | |
| r | 16291 | 6.6% | |
| t | 14065 | 5.7% | |
| o | 10147 | 4.1% | |
| u | 9620 | 3.9% | |
| e | 8713 | 3.5% | |
| p | 6321 | 2.6% | |
| g | 5556 | 2.3% | |
| b | 2932 | 1.2% | |
| d | 2932 | 1.2% | |
| i | 2872 | 1.2% | |
| n | 2338 | 0.9% | |
| h | 1940 | 0.8% | |
| s | 1873 | 0.8% | |
| l | 1795 | 0.7% | |
| y | 1587 | 0.6% | |
| m | 685 | 0.3% | |
| k | 551 | 0.2% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 6002 | 100.0% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 99 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 246294 | 97.6% | |
| Common | 6101 | 2.4% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| w | 52220 | 21.2% | |
| v | 41716 | 16.9% | |
| c | 40530 | 16.5% | |
| a | 21610 | 8.8% | |
| r | 16291 | 6.6% | |
| t | 14065 | 5.7% | |
| o | 10147 | 4.1% | |
| u | 9620 | 3.9% | |
| e | 8713 | 3.5% | |
| p | 6321 | 2.6% | |
| g | 5556 | 2.3% | |
| b | 2932 | 1.2% | |
| d | 2932 | 1.2% | |
| i | 2872 | 1.2% | |
| n | 2338 | 0.9% | |
| h | 1940 | 0.8% | |
| s | 1873 | 0.8% | |
| l | 1795 | 0.7% | |
| y | 1587 | 0.6% | |
| m | 685 | 0.3% | |
| k | 551 | 0.2% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 6002 | 98.4% | |
| - | 99 | 1.6% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 252395 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| w | 52220 | 20.7% | |
| v | 41716 | 16.5% | |
| c | 40530 | 16.1% | |
| a | 21610 | 8.6% | |
| r | 16291 | 6.5% | |
| t | 14065 | 5.6% | |
| o | 10147 | 4.0% | |
| u | 9620 | 3.8% | |
| e | 8713 | 3.5% | |
| p | 6321 | 2.5% | |
| (space) | 6002 | 2.4% | |
| g | 5556 | 2.2% | |
| b | 2932 | 1.2% | |
| d | 2932 | 1.2% | |
| i | 2872 | 1.1% | |
| n | 2338 | 0.9% | |
| h | 1940 | 0.8% | |
| s | 1873 | 0.7% | |
| l | 1795 | 0.7% | |
| y | 1587 | 0.6% | |
| m | 685 | 0.3% | |
| k | 551 | 0.2% | |
| - | 99 | < 0.1% |
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count |
|---|---|
| user-group | 50767 |
| commercial | 3635 |
| parastatal | 1696 |
| other | 939 |
| unknown | 551 |
| Value | Count | Frequency (%) | |
| user-group | 50767 | 88.2% | |
| commercial | 3635 | 6.3% | |
| parastatal | 1696 | 2.9% | |
| other | 939 | 1.6% | |
| unknown | 551 | 1.0% |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.889768702 |
| Min length | 5 |
Most occurring characters
| Value | Count | Frequency (%) | |
| r | 107804 | 18.9% | |
| u | 102085 | 17.9% | |
| o | 55892 | 9.8% | |
| e | 55341 | 9.7% | |
| s | 52463 | 9.2% | |
| p | 52463 | 9.2% | |
| - | 50767 | 8.9% | |
| g | 50767 | 8.9% | |
| a | 10419 | 1.8% | |
| c | 7270 | 1.3% | |
| m | 7270 | 1.3% | |
| l | 5331 | 0.9% | |
| t | 4331 | 0.8% | |
| i | 3635 | 0.6% | |
| n | 1653 | 0.3% | |
| h | 939 | 0.2% | |
| k | 551 | 0.1% | |
| w | 551 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 518765 | 91.1% | |
| Dash Punctuation | 50767 | 8.9% |
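The character and category tables can be re-derived from the value counts listed above. The following is a rough sketch, assuming each character is counted once per occurrence in every observation and grouped by its Unicode general category (`Ll` for Lowercase Letter, `Pd` for Dash Punctuation); it uses only the Python standard library and is not the profiler's implementation.

```python
from collections import Counter
import unicodedata

# Value counts copied from the frequency table above.
value_counts = {
    "user-group": 50767,
    "commercial": 3635,
    "parastatal": 1696,
    "other": 939,
    "unknown": 551,
}

char_counts = Counter()
category_counts = Counter()
for value, n in value_counts.items():
    for ch in value:
        # Each occurrence of a character contributes once per observation.
        char_counts[ch] += n
        category_counts[unicodedata.category(ch)] += n

total = sum(char_counts.values())
for ch, n in char_counts.most_common(3):
    print(f"{ch!r}: {n} ({100 * n / total:.1f}%)")
print(dict(category_counts))
```

With these inputs the totals for `r`, `u` and `o`, and the split between lowercase letters and dashes, agree with the tables in this section.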
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| r | 107804 | 20.8% | |
| u | 102085 | 19.7% | |
| o | 55892 | 10.8% | |
| e | 55341 | 10.7% | |
| s | 52463 | 10.1% | |
| p | 52463 | 10.1% | |
| g | 50767 | 9.8% | |
| a | 10419 | 2.0% | |
| c | 7270 | 1.4% | |
| m | 7270 | 1.4% | |
| l | 5331 | 1.0% | |
| t | 4331 | 0.8% | |
| i | 3635 | 0.7% | |
| n | 1653 | 0.3% | |
| h | 939 | 0.2% | |
| k | 551 | 0.1% | |
| w | 551 | 0.1% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 50767 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 518765 | 91.1% | |
| Common | 50767 | 8.9% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| r | 107804 | 20.8% | |
| u | 102085 | 19.7% | |
| o | 55892 | 10.8% | |
| e | 55341 | 10.7% | |
| s | 52463 | 10.1% | |
| p | 52463 | 10.1% | |
| g | 50767 | 9.8% | |
| a | 10419 | 2.0% | |
| c | 7270 | 1.4% | |
| m | 7270 | 1.4% | |
| l | 5331 | 1.0% | |
| t | 4331 | 0.8% | |
| i | 3635 | 0.7% | |
| n | 1653 | 0.3% | |
| h | 939 | 0.2% | |
| k | 551 | 0.1% | |
| w | 551 | 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| - | 50767 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 569532 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| r | 107804 | 18.9% | |
| u | 102085 | 17.9% | |
| o | 55892 | 9.8% | |
| e | 55341 | 9.7% | |
| s | 52463 | 9.2% | |
| p | 52463 | 9.2% | |
| - | 50767 | 8.9% | |
| g | 50767 | 8.9% | |
| a | 10419 | 1.8% | |
| c | 7270 | 1.3% | |
| m | 7270 | 1.3% | |
| l | 5331 | 0.9% | |
| t | 4331 | 0.8% | |
| i | 3635 | 0.6% | |
| n | 1653 | 0.3% | |
| h | 939 | 0.2% | |
| k | 551 | 0.1% | |
| w | 551 | 0.1% |
| Distinct count | 7 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count |
|---|---|
| never pay | 24380 |
| pay per bucket | 8953 |
| pay monthly | 8229 |
| unknown | 7654 |
| pay when scheme fails | 3843 |
| Other values (2) | 4529 |
| Value | Count | Frequency (%) | |
| never pay | 24380 | 42.3% | |
| pay per bucket | 8953 | 15.5% | |
| pay monthly | 8229 | 14.3% | |
| unknown | 7654 | 13.3% | |
| pay when scheme fails | 3843 | 6.7% | |
| pay annually | 3626 | 6.3% | |
| other | 903 | 1.6% |
Length
| Max length | 21 |
|---|---|
| Median length | 9 |
| Mean length | 10.72426547 |
| Min length | 5 |
Most occurring characters
| Value | Count | Frequency (%) | |
| e | 79098 | 12.8% | |
| n | 66666 | 10.8% | |
| (space) | 65670 | 10.6% | |
| y | 60886 | 9.9% | |
| a | 60126 | 9.7% | |
| p | 57984 | 9.4% | |
| r | 34236 | 5.5% | |
| v | 24380 | 3.9% | |
| u | 20233 | 3.3% | |
| l | 19324 | 3.1% | |
| t | 18085 | 2.9% | |
| h | 16818 | 2.7% | |
| o | 16786 | 2.7% | |
| k | 16607 | 2.7% | |
| c | 12796 | 2.1% | |
| m | 12072 | 2.0% | |
| w | 11497 | 1.9% | |
| b | 8953 | 1.4% | |
| s | 7686 | 1.2% | |
| f | 3843 | 0.6% | |
| i | 3843 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 551919 | 89.4% | |
| Space Separator | 65670 | 10.6% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| e | 79098 | 14.3% | |
| n | 66666 | 12.1% | |
| y | 60886 | 11.0% | |
| a | 60126 | 10.9% | |
| p | 57984 | 10.5% | |
| r | 34236 | 6.2% | |
| v | 24380 | 4.4% | |
| u | 20233 | 3.7% | |
| l | 19324 | 3.5% | |
| t | 18085 | 3.3% | |
| h | 16818 | 3.0% | |
| o | 16786 | 3.0% | |
| k | 16607 | 3.0% | |
| c | 12796 | 2.3% | |
| m | 12072 | 2.2% | |
| w | 11497 | 2.1% | |
| b | 8953 | 1.6% | |
| s | 7686 | 1.4% | |
| f | 3843 | 0.7% | |
| i | 3843 | 0.7% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 65670 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 551919 | 89.4% | |
| Common | 65670 | 10.6% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| e | 79098 | 14.3% | |
| n | 66666 | 12.1% | |
| y | 60886 | 11.0% | |
| a | 60126 | 10.9% | |
| p | 57984 | 10.5% | |
| r | 34236 | 6.2% | |
| v | 24380 | 4.4% | |
| u | 20233 | 3.7% | |
| l | 19324 | 3.5% | |
| t | 18085 | 3.3% | |
| h | 16818 | 3.0% | |
| o | 16786 | 3.0% | |
| k | 16607 | 3.0% | |
| c | 12796 | 2.3% | |
| m | 12072 | 2.2% | |
| w | 11497 | 2.1% | |
| b | 8953 | 1.6% | |
| s | 7686 | 1.4% | |
| f | 3843 | 0.7% | |
| i | 3843 | 0.7% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 65670 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 617589 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| e | 79098 | 12.8% | |
| n | 66666 | 10.8% | |
| (space) | 65670 | 10.6% | |
| y | 60886 | 9.9% | |
| a | 60126 | 9.7% | |
| p | 57984 | 9.4% | |
| r | 34236 | 5.5% | |
| v | 24380 | 3.9% | |
| u | 20233 | 3.3% | |
| l | 19324 | 3.1% | |
| t | 18085 | 2.9% | |
| h | 16818 | 2.7% | |
| o | 16786 | 2.7% | |
| k | 16607 | 2.7% | |
| c | 12796 | 2.1% | |
| m | 12072 | 2.0% | |
| w | 11497 | 1.9% | |
| b | 8953 | 1.4% | |
| s | 7686 | 1.2% | |
| f | 3843 | 0.6% | |
| i | 3843 | 0.6% |
| Distinct count | 7 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count |
|---|---|
| never pay | 24380 |
| per bucket | 8953 |
| monthly | 8229 |
| unknown | 7654 |
| on failure | 3843 |
| Other values (2) | 4529 |
| Value | Count | Frequency (%) | |
| never pay | 24380 | 42.3% | |
| per bucket | 8953 | 15.5% | |
| monthly | 8229 | 14.3% | |
| unknown | 7654 | 13.3% | |
| on failure | 3843 | 6.7% | |
| annually | 3626 | 6.3% | |
| other | 903 | 1.6% |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 8.544905189 |
| Min length | 5 |
Most occurring characters
| Value | Count | Frequency (%) | |
| e | 71412 | 14.5% | |
| n | 66666 | 13.5% | |
| r | 38079 | 7.7% | |
| (space) | 37176 | 7.6% | |
| y | 36235 | 7.4% | |
| a | 35475 | 7.2% | |
| p | 33333 | 6.8% | |
| v | 24380 | 5.0% | |
| u | 24076 | 4.9% | |
| o | 20629 | 4.2% | |
| l | 19324 | 3.9% | |
| t | 18085 | 3.7% | |
| k | 16607 | 3.4% | |
| h | 9132 | 1.9% | |
| b | 8953 | 1.8% | |
| c | 8953 | 1.8% | |
| m | 8229 | 1.7% | |
| w | 7654 | 1.6% | |
| f | 3843 | 0.8% | |
| i | 3843 | 0.8% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 454908 | 92.4% | |
| Space Separator | 37176 | 7.6% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| e | 71412 | 15.7% | |
| n | 66666 | 14.7% | |
| r | 38079 | 8.4% | |
| y | 36235 | 8.0% | |
| a | 35475 | 7.8% | |
| p | 33333 | 7.3% | |
| v | 24380 | 5.4% | |
| u | 24076 | 5.3% | |
| o | 20629 | 4.5% | |
| l | 19324 | 4.2% | |
| t | 18085 | 4.0% | |
| k | 16607 | 3.7% | |
| h | 9132 | 2.0% | |
| b | 8953 | 2.0% | |
| c | 8953 | 2.0% | |
| m | 8229 | 1.8% | |
| w | 7654 | 1.7% | |
| f | 3843 | 0.8% | |
| i | 3843 | 0.8% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 37176 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 454908 | 92.4% | |
| Common | 37176 | 7.6% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| e | 71412 | 15.7% | |
| n | 66666 | 14.7% | |
| r | 38079 | 8.4% | |
| y | 36235 | 8.0% | |
| a | 35475 | 7.8% | |
| p | 33333 | 7.3% | |
| v | 24380 | 5.4% | |
| u | 24076 | 5.3% | |
| o | 20629 | 4.5% | |
| l | 19324 | 4.2% | |
| t | 18085 | 4.0% | |
| k | 16607 | 3.7% | |
| h | 9132 | 2.0% | |
| b | 8953 | 2.0% | |
| c | 8953 | 2.0% | |
| m | 8229 | 1.8% | |
| w | 7654 | 1.7% | |
| f | 3843 | 0.8% | |
| i | 3843 | 0.8% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 37176 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 492084 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| e | 71412 | 14.5% | |
| n | 66666 | 13.5% | |
| r | 38079 | 7.7% | |
| (space) | 37176 | 7.6% | |
| y | 36235 | 7.4% | |
| a | 35475 | 7.2% | |
| p | 33333 | 6.8% | |
| v | 24380 | 5.0% | |
| u | 24076 | 4.9% | |
| o | 20629 | 4.2% | |
| l | 19324 | 3.9% | |
| t | 18085 | 3.7% | |
| k | 16607 | 3.4% | |
| h | 9132 | 1.9% | |
| b | 8953 | 1.8% | |
| c | 8953 | 1.8% | |
| m | 8229 | 1.7% | |
| w | 7654 | 1.6% | |
| f | 3843 | 0.8% | |
| i | 3843 | 0.8% |
| Distinct count | 8 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count |
|---|---|
| soft | 49431 |
| salty | 4772 |
| unknown | 1661 |
| milky | 803 |
| coloured | 479 |
| Other values (3) | 442 |
| Value | Count | Frequency (%) | |
| soft | 49431 | 85.8% | |
| salty | 4772 | 8.3% | |
| unknown | 1661 | 2.9% | |
| milky | 803 | 1.4% | |
| coloured | 479 | 0.8% | |
| salty abandoned | 228 | 0.4% | |
| fluoride | 199 | 0.3% | |
| fluoride abandoned | 15 | < 0.1% |
Length
| Max length | 18 |
|---|---|
| Median length | 4 |
| Mean length | 4.277627283 |
| Min length | 4 |
Most occurring characters
| Value | Count | Frequency (%) | |
| s | 54431 | 22.1% | |
| t | 54431 | 22.1% | |
| o | 52507 | 21.3% | |
| f | 49645 | 20.2% | |
| l | 6496 | 2.6% | |
| y | 5803 | 2.4% | |
| a | 5486 | 2.2% | |
| n | 5469 | 2.2% | |
| k | 2464 | 1.0% | |
| u | 2354 | 1.0% | |
| w | 1661 | 0.7% | |
| d | 1179 | 0.5% | |
| i | 1017 | 0.4% | |
| e | 936 | 0.4% | |
| m | 803 | 0.3% | |
| r | 693 | 0.3% | |
| c | 479 | 0.2% | |
| (space) | 243 | 0.1% | |
| b | 243 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 246097 | 99.9% | |
| Space Separator | 243 | 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| s | 54431 | 22.1% | |
| t | 54431 | 22.1% | |
| o | 52507 | 21.3% | |
| f | 49645 | 20.2% | |
| l | 6496 | 2.6% | |
| y | 5803 | 2.4% | |
| a | 5486 | 2.2% | |
| n | 5469 | 2.2% | |
| k | 2464 | 1.0% | |
| u | 2354 | 1.0% | |
| w | 1661 | 0.7% | |
| d | 1179 | 0.5% | |
| i | 1017 | 0.4% | |
| e | 936 | 0.4% | |
| m | 803 | 0.3% | |
| r | 693 | 0.3% | |
| c | 479 | 0.2% | |
| b | 243 | 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 243 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 246097 | 99.9% | |
| Common | 243 | 0.1% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| s | 54431 | 22.1% | |
| t | 54431 | 22.1% | |
| o | 52507 | 21.3% | |
| f | 49645 | 20.2% | |
| l | 6496 | 2.6% | |
| y | 5803 | 2.4% | |
| a | 5486 | 2.2% | |
| n | 5469 | 2.2% | |
| k | 2464 | 1.0% | |
| u | 2354 | 1.0% | |
| w | 1661 | 0.7% | |
| d | 1179 | 0.5% | |
| i | 1017 | 0.4% | |
| e | 936 | 0.4% | |
| m | 803 | 0.3% | |
| r | 693 | 0.3% | |
| c | 479 | 0.2% | |
| b | 243 | 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 243 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 246340 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| s | 54431 | 22.1% | |
| t | 54431 | 22.1% | |
| o | 52507 | 21.3% | |
| f | 49645 | 20.2% | |
| l | 6496 | 2.6% | |
| y | 5803 | 2.4% | |
| a | 5486 | 2.2% | |
| n | 5469 | 2.2% | |
| k | 2464 | 1.0% | |
| u | 2354 | 1.0% | |
| w | 1661 | 0.7% | |
| d | 1179 | 0.5% | |
| i | 1017 | 0.4% | |
| e | 936 | 0.4% | |
| m | 803 | 0.3% | |
| r | 693 | 0.3% | |
| c | 479 | 0.2% | |
| (space) | 243 | 0.1% | |
| b | 243 | 0.1% |
| Distinct count | 6 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count |
|---|---|
| good | 49431 |
| salty | 5000 |
| unknown | 1661 |
| milky | 803 |
| colored | 479 |
| Value | Count | Frequency (%) | |
| good | 49431 | 85.8% | |
| salty | 5000 | 8.7% | |
| unknown | 1661 | 2.9% | |
| milky | 803 | 1.4% | |
| colored | 479 | 0.8% | |
| fluoride | 214 | 0.4% |
Length
| Max length | 8 |
|---|---|
| Median length | 4 |
| Mean length | 4.227113287 |
| Min length | 4 |
Most occurring characters
| Value | Count | Frequency (%) | |
| o | 101695 | 41.8% | |
| d | 50124 | 20.6% | |
| g | 49431 | 20.3% | |
| l | 6496 | 2.7% | |
| y | 5803 | 2.4% | |
| s | 5000 | 2.1% | |
| a | 5000 | 2.1% | |
| t | 5000 | 2.1% | |
| n | 4983 | 2.0% | |
| k | 2464 | 1.0% | |
| u | 1875 | 0.8% | |
| w | 1661 | 0.7% | |
| i | 1017 | 0.4% | |
| m | 803 | 0.3% | |
| r | 693 | 0.3% | |
| e | 693 | 0.3% | |
| c | 479 | 0.2% | |
| f | 214 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 243431 | 100.0% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| o | 101695 | 41.8% | |
| d | 50124 | 20.6% | |
| g | 49431 | 20.3% | |
| l | 6496 | 2.7% | |
| y | 5803 | 2.4% | |
| s | 5000 | 2.1% | |
| a | 5000 | 2.1% | |
| t | 5000 | 2.1% | |
| n | 4983 | 2.0% | |
| k | 2464 | 1.0% | |
| u | 1875 | 0.8% | |
| w | 1661 | 0.7% | |
| i | 1017 | 0.4% | |
| m | 803 | 0.3% | |
| r | 693 | 0.3% | |
| e | 693 | 0.3% | |
| c | 479 | 0.2% | |
| f | 214 | 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 243431 | 100.0% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| o | 101695 | 41.8% | |
| d | 50124 | 20.6% | |
| g | 49431 | 20.3% | |
| l | 6496 | 2.7% | |
| y | 5803 | 2.4% | |
| s | 5000 | 2.1% | |
| a | 5000 | 2.1% | |
| t | 5000 | 2.1% | |
| n | 4983 | 2.0% | |
| k | 2464 | 1.0% | |
| u | 1875 | 0.8% | |
| w | 1661 | 0.7% | |
| i | 1017 | 0.4% | |
| m | 803 | 0.3% | |
| r | 693 | 0.3% | |
| e | 693 | 0.3% | |
| c | 479 | 0.2% | |
| f | 214 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 243431 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| o | 101695 | 41.8% | |
| d | 50124 | 20.6% | |
| g | 49431 | 20.3% | |
| l | 6496 | 2.7% | |
| y | 5803 | 2.4% | |
| s | 5000 | 2.1% | |
| a | 5000 | 2.1% | |
| t | 5000 | 2.1% | |
| n | 4983 | 2.0% | |
| k | 2464 | 1.0% | |
| u | 1875 | 0.8% | |
| w | 1661 | 0.7% | |
| i | 1017 | 0.4% | |
| m | 803 | 0.3% | |
| r | 693 | 0.3% | |
| e | 693 | 0.3% | |
| c | 479 | 0.2% | |
| f | 214 | 0.1% |
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count |
|---|---|
| enough | 32260 |
| insufficient | 14564 |
| dry | 5990 |
| seasonal | 4001 |
| unknown | 773 |
| Value | Count | Frequency (%) | |
| enough | 32260 | 56.0% | |
| insufficient | 14564 | 25.3% | |
| dry | 5990 | 10.4% | |
| seasonal | 4001 | 6.9% | |
| unknown | 773 | 1.3% |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 7.357730777 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| n | 67708 | 16.0% | |
| e | 50825 | 12.0% | |
| u | 47597 | 11.2% | |
| i | 43692 | 10.3% | |
| o | 37034 | 8.7% | |
| g | 32260 | 7.6% | |
| h | 32260 | 7.6% | |
| f | 29128 | 6.9% | |
| s | 22566 | 5.3% | |
| c | 14564 | 3.4% | |
| t | 14564 | 3.4% | |
| a | 8002 | 1.9% | |
| d | 5990 | 1.4% | |
| r | 5990 | 1.4% | |
| y | 5990 | 1.4% | |
| l | 4001 | 0.9% | |
| k | 773 | 0.2% | |
| w | 773 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 423717 | 100.0% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| n | 67708 | 16.0% | |
| e | 50825 | 12.0% | |
| u | 47597 | 11.2% | |
| i | 43692 | 10.3% | |
| o | 37034 | 8.7% | |
| g | 32260 | 7.6% | |
| h | 32260 | 7.6% | |
| f | 29128 | 6.9% | |
| s | 22566 | 5.3% | |
| c | 14564 | 3.4% | |
| t | 14564 | 3.4% | |
| a | 8002 | 1.9% | |
| d | 5990 | 1.4% | |
| r | 5990 | 1.4% | |
| y | 5990 | 1.4% | |
| l | 4001 | 0.9% | |
| k | 773 | 0.2% | |
| w | 773 | 0.2% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 423717 | 100.0% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| n | 67708 | 16.0% | |
| e | 50825 | 12.0% | |
| u | 47597 | 11.2% | |
| i | 43692 | 10.3% | |
| o | 37034 | 8.7% | |
| g | 32260 | 7.6% | |
| h | 32260 | 7.6% | |
| f | 29128 | 6.9% | |
| s | 22566 | 5.3% | |
| c | 14564 | 3.4% | |
| t | 14564 | 3.4% | |
| a | 8002 | 1.9% | |
| d | 5990 | 1.4% | |
| r | 5990 | 1.4% | |
| y | 5990 | 1.4% | |
| l | 4001 | 0.9% | |
| k | 773 | 0.2% | |
| w | 773 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 423717 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| n | 67708 | 16.0% | |
| e | 50825 | 12.0% | |
| u | 47597 | 11.2% | |
| i | 43692 | 10.3% | |
| o | 37034 | 8.7% | |
| g | 32260 | 7.6% | |
| h | 32260 | 7.6% | |
| f | 29128 | 6.9% | |
| s | 22566 | 5.3% | |
| c | 14564 | 3.4% | |
| t | 14564 | 3.4% | |
| a | 8002 | 1.9% | |
| d | 5990 | 1.4% | |
| r | 5990 | 1.4% | |
| y | 5990 | 1.4% | |
| l | 4001 | 0.9% | |
| k | 773 | 0.2% | |
| w | 773 | 0.2% |
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count |
|---|---|
| enough | 32260 |
| insufficient | 14564 |
| dry | 5990 |
| seasonal | 4001 |
| unknown | 773 |
| Value | Count | Frequency (%) | |
| enough | 32260 | 56.0% | |
| insufficient | 14564 | 25.3% | |
| dry | 5990 | 10.4% | |
| seasonal | 4001 | 6.9% | |
| unknown | 773 | 1.3% |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 7.357730777 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| n | 67708 | 16.0% | |
| e | 50825 | 12.0% | |
| u | 47597 | 11.2% | |
| i | 43692 | 10.3% | |
| o | 37034 | 8.7% | |
| g | 32260 | 7.6% | |
| h | 32260 | 7.6% | |
| f | 29128 | 6.9% | |
| s | 22566 | 5.3% | |
| c | 14564 | 3.4% | |
| t | 14564 | 3.4% | |
| a | 8002 | 1.9% | |
| d | 5990 | 1.4% | |
| r | 5990 | 1.4% | |
| y | 5990 | 1.4% | |
| l | 4001 | 0.9% | |
| k | 773 | 0.2% | |
| w | 773 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 423717 | 100.0% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| n | 67708 | 16.0% | |
| e | 50825 | 12.0% | |
| u | 47597 | 11.2% | |
| i | 43692 | 10.3% | |
| o | 37034 | 8.7% | |
| g | 32260 | 7.6% | |
| h | 32260 | 7.6% | |
| f | 29128 | 6.9% | |
| s | 22566 | 5.3% | |
| c | 14564 | 3.4% | |
| t | 14564 | 3.4% | |
| a | 8002 | 1.9% | |
| d | 5990 | 1.4% | |
| r | 5990 | 1.4% | |
| y | 5990 | 1.4% | |
| l | 4001 | 0.9% | |
| k | 773 | 0.2% | |
| w | 773 | 0.2% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 423717 | 100.0% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| n | 67708 | 16.0% | |
| e | 50825 | 12.0% | |
| u | 47597 | 11.2% | |
| i | 43692 | 10.3% | |
| o | 37034 | 8.7% | |
| g | 32260 | 7.6% | |
| h | 32260 | 7.6% | |
| f | 29128 | 6.9% | |
| s | 22566 | 5.3% | |
| c | 14564 | 3.4% | |
| t | 14564 | 3.4% | |
| a | 8002 | 1.9% | |
| d | 5990 | 1.4% | |
| r | 5990 | 1.4% | |
| y | 5990 | 1.4% | |
| l | 4001 | 0.9% | |
| k | 773 | 0.2% | |
| w | 773 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 423717 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| n | 67708 | 16.0% | |
| e | 50825 | 12.0% | |
| u | 47597 | 11.2% | |
| i | 43692 | 10.3% | |
| o | 37034 | 8.7% | |
| g | 32260 | 7.6% | |
| h | 32260 | 7.6% | |
| f | 29128 | 6.9% | |
| s | 22566 | 5.3% | |
| c | 14564 | 3.4% | |
| t | 14564 | 3.4% | |
| a | 8002 | 1.9% | |
| d | 5990 | 1.4% | |
| r | 5990 | 1.4% | |
| y | 5990 | 1.4% | |
| l | 4001 | 0.9% | |
| k | 773 | 0.2% | |
| w | 773 | 0.2% |
| Distinct count | 10 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count |
|---|---|
| spring | 17006 |
| shallow well | 15499 |
| machine dbh | 10826 |
| river | 9612 |
| rainwater harvesting | 2218 |
| Other values (5) | 2427 |
| Value | Count | Frequency (%) | |
| spring | 17006 | 29.5% | |
| shallow well | 15499 | 26.9% | |
| machine dbh | 10826 | 18.8% | |
| river | 9612 | 16.7% | |
| rainwater harvesting | 2218 | 3.9% | |
| hand dtw | 873 | 1.5% | |
| dam | 649 | 1.1% | |
| lake | 639 | 1.1% | |
| other | 202 | 0.4% | |
| unknown | 64 | 0.1% |
Length
| Max length | 20 |
|---|---|
| Median length | 8 |
| Mean length | 8.898989373 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| l | 62635 | 12.2% | |
| r | 43086 | 8.4% | |
| i | 41880 | 8.2% | |
| e | 41214 | 8.0% | |
| h | 40444 | 7.9% | |
| a | 35140 | 6.9% | |
| s | 34723 | 6.8% | |
| w | 34153 | 6.7% | |
| n | 33333 | 6.5% | |
| (space) | 29416 | 5.7% | |
| g | 19224 | 3.8% | |
| p | 17006 | 3.3% | |
| o | 15765 | 3.1% | |
| d | 13221 | 2.6% | |
| v | 11830 | 2.3% | |
| m | 11475 | 2.2% | |
| c | 10826 | 2.1% | |
| b | 10826 | 2.1% | |
| t | 5511 | 1.1% | |
| k | 703 | 0.1% | |
| u | 64 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 483059 | 94.3% | |
| Space Separator | 29416 | 5.7% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| l | 62635 | 13.0% | |
| r | 43086 | 8.9% | |
| i | 41880 | 8.7% | |
| e | 41214 | 8.5% | |
| h | 40444 | 8.4% | |
| a | 35140 | 7.3% | |
| s | 34723 | 7.2% | |
| w | 34153 | 7.1% | |
| n | 33333 | 6.9% | |
| g | 19224 | 4.0% | |
| p | 17006 | 3.5% | |
| o | 15765 | 3.3% | |
| d | 13221 | 2.7% | |
| v | 11830 | 2.4% | |
| m | 11475 | 2.4% | |
| c | 10826 | 2.2% | |
| b | 10826 | 2.2% | |
| t | 5511 | 1.1% | |
| k | 703 | 0.1% | |
| u | 64 | < 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 29416 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 483059 | 94.3% | |
| Common | 29416 | 5.7% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| l | 62635 | 13.0% | |
| r | 43086 | 8.9% | |
| i | 41880 | 8.7% | |
| e | 41214 | 8.5% | |
| h | 40444 | 8.4% | |
| a | 35140 | 7.3% | |
| s | 34723 | 7.2% | |
| w | 34153 | 7.1% | |
| n | 33333 | 6.9% | |
| g | 19224 | 4.0% | |
| p | 17006 | 3.5% | |
| o | 15765 | 3.3% | |
| d | 13221 | 2.7% | |
| v | 11830 | 2.4% | |
| m | 11475 | 2.4% | |
| c | 10826 | 2.2% | |
| b | 10826 | 2.2% | |
| t | 5511 | 1.1% | |
| k | 703 | 0.1% | |
| u | 64 | < 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 29416 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 512475 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| l | 62635 | 12.2% | |
| r | 43086 | 8.4% | |
| i | 41880 | 8.2% | |
| e | 41214 | 8.0% | |
| h | 40444 | 7.9% | |
| a | 35140 | 6.9% | |
| s | 34723 | 6.8% | |
| w | 34153 | 6.7% | |
| n | 33333 | 6.5% | |
| (space) | 29416 | 5.7% | |
| g | 19224 | 3.8% | |
| p | 17006 | 3.3% | |
| o | 15765 | 3.1% | |
| d | 13221 | 2.6% | |
| v | 11830 | 2.3% | |
| m | 11475 | 2.2% | |
| c | 10826 | 2.1% | |
| b | 10826 | 2.1% | |
| t | 5511 | 1.1% | |
| k | 703 | 0.1% | |
| u | 64 | < 0.1% |
| Distinct count | 7 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count |
|---|---|
| spring | 17006 |
| shallow well | 15499 |
| borehole | 11699 |
| river/lake | 10251 |
| rainwater harvesting | 2218 |
| Other values (2) | 915 |
| Value | Count | Frequency (%) | |
| spring | 17006 | 29.5% | |
| shallow well | 15499 | 26.9% | |
| borehole | 11699 | 20.3% | |
| river/lake | 10251 | 17.8% | |
| rainwater harvesting | 2218 | 3.9% | |
| dam | 649 | 1.1% | |
| other | 266 | 0.5% |
Length
| Max length | 20 |
|---|---|
| Median length | 8 |
| Mean length | 9.233920261 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| l | 83946 | 15.8% | |
| e | 64101 | 12.1% | |
| r | 56127 | 10.6% | |
| o | 39163 | 7.4% | |
| s | 34723 | 6.5% | |
| w | 33216 | 6.2% | |
| a | 33053 | 6.2% | |
| i | 31693 | 6.0% | |
| h | 29682 | 5.6% | |
| n | 21442 | 4.0% | |
| g | 19224 | 3.6% | |
| (space) | 17717 | 3.3% | |
| p | 17006 | 3.2% | |
| v | 12469 | 2.3% | |
| b | 11699 | 2.2% | |
| / | 10251 | 1.9% | |
| k | 10251 | 1.9% | |
| t | 4702 | 0.9% | |
| d | 649 | 0.1% | |
| m | 649 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 503795 | 94.7% | |
| Space Separator | 17717 | 3.3% | |
| Other Punctuation | 10251 | 1.9% |
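The category counts above can be reproduced from the value frequencies alone. The following is a minimal sketch, assuming the categories correspond to Unicode general categories as returned by Python's `unicodedata.category` (the report's own implementation may differ). The names `value_counts` and `CATEGORY_NAMES` are illustrative, and the counts are copied from the frequency table for this variable.

```python
import unicodedata
from collections import Counter

# Value frequencies copied from the frequency table for this variable.
value_counts = {
    "spring": 17006,
    "shallow well": 15499,
    "borehole": 11699,
    "river/lake": 10251,
    "rainwater harvesting": 2218,
    "dam": 649,
    "other": 266,
}

# Readable labels for the general categories that occur here (assumed mapping).
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}

# Each character of a value contributes once per observation of that value.
category_counts = Counter()
for value, count in value_counts.items():
    for char in value:
        code = unicodedata.category(char)
        category_counts[CATEGORY_NAMES.get(code, code)] += count

total_chars = sum(category_counts.values())
mean_length = total_chars / sum(value_counts.values())

for name, count in category_counts.most_common():
    print(f"{name}: {count} ({count / total_chars:.1%})")
print(f"Mean length: {mean_length:.6f}")
```

Weighting each character by the frequency of its value avoids iterating over all 57588 observations.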
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| l | 83946 | 16.7% | |
| e | 64101 | 12.7% | |
| r | 56127 | 11.1% | |
| o | 39163 | 7.8% | |
| s | 34723 | 6.9% | |
| w | 33216 | 6.6% | |
| a | 33053 | 6.6% | |
| i | 31693 | 6.3% | |
| h | 29682 | 5.9% | |
| n | 21442 | 4.3% | |
| g | 19224 | 3.8% | |
| p | 17006 | 3.4% | |
| v | 12469 | 2.5% | |
| b | 11699 | 2.3% | |
| k | 10251 | 2.0% | |
| t | 4702 | 0.9% | |
| d | 649 | 0.1% | |
| m | 649 | 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 17717 | 100.0% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| / | 10251 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 503795 | 94.7% | |
| Common | 27968 | 5.3% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| l | 83946 | 16.7% | |
| e | 64101 | 12.7% | |
| r | 56127 | 11.1% | |
| o | 39163 | 7.8% | |
| s | 34723 | 6.9% | |
| w | 33216 | 6.6% | |
| a | 33053 | 6.6% | |
| i | 31693 | 6.3% | |
| h | 29682 | 5.9% | |
| n | 21442 | 4.3% | |
| g | 19224 | 3.8% | |
| p | 17006 | 3.4% | |
| v | 12469 | 2.5% | |
| b | 11699 | 2.3% | |
| k | 10251 | 2.0% | |
| t | 4702 | 0.9% | |
| d | 649 | 0.1% | |
| m | 649 | 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 17717 | 63.3% | |
| / | 10251 | 36.7% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 531763 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| l | 83946 | 15.8% | |
| e | 64101 | 12.1% | |
| r | 56127 | 10.6% | |
| o | 39163 | 7.4% | |
| s | 34723 | 6.5% | |
| w | 33216 | 6.2% | |
| a | 33053 | 6.2% | |
| i | 31693 | 6.0% | |
| h | 29682 | 5.6% | |
| n | 21442 | 4.0% | |
| g | 19224 | 3.6% | |
| (space) | 17717 | 3.3% | |
| p | 17006 | 3.2% | |
| v | 12469 | 2.3% | |
| b | 11699 | 2.2% | |
| / | 10251 | 1.9% | |
| k | 10251 | 1.9% | |
| t | 4702 | 0.9% | |
| d | 649 | 0.1% | |
| m | 649 | 0.1% |
source_class
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count | Frequency (%) | |
| groundwater | 44204 | 76.8% | |
| surface | 13118 | 22.8% | |
| unknown | 266 | 0.5% |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 10.07036188 |
| Min length | 7 |
Most occurring characters
| Value | Count | Frequency (%) | |
| r | 101526 | 17.5% | |
| u | 57588 | 9.9% | |
| a | 57322 | 9.9% | |
| e | 57322 | 9.9% | |
| n | 45002 | 7.8% | |
| o | 44470 | 7.7% | |
| w | 44470 | 7.7% | |
| g | 44204 | 7.6% | |
| d | 44204 | 7.6% | |
| t | 44204 | 7.6% | |
| s | 13118 | 2.3% | |
| f | 13118 | 2.3% | |
| c | 13118 | 2.3% | |
| k | 266 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 579932 | 100.0% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| r | 101526 | 17.5% | |
| u | 57588 | 9.9% | |
| a | 57322 | 9.9% | |
| e | 57322 | 9.9% | |
| n | 45002 | 7.8% | |
| o | 44470 | 7.7% | |
| w | 44470 | 7.7% | |
| g | 44204 | 7.6% | |
| d | 44204 | 7.6% | |
| t | 44204 | 7.6% | |
| s | 13118 | 2.3% | |
| f | 13118 | 2.3% | |
| c | 13118 | 2.3% | |
| k | 266 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 579932 | 100.0% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| r | 101526 | 17.5% | |
| u | 57588 | 9.9% | |
| a | 57322 | 9.9% | |
| e | 57322 | 9.9% | |
| n | 45002 | 7.8% | |
| o | 44470 | 7.7% | |
| w | 44470 | 7.7% | |
| g | 44204 | 7.6% | |
| d | 44204 | 7.6% | |
| t | 44204 | 7.6% | |
| s | 13118 | 2.3% | |
| f | 13118 | 2.3% | |
| c | 13118 | 2.3% | |
| k | 266 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 579932 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| r | 101526 | 17.5% | |
| u | 57588 | 9.9% | |
| a | 57322 | 9.9% | |
| e | 57322 | 9.9% | |
| n | 45002 | 7.8% | |
| o | 44470 | 7.7% | |
| w | 44470 | 7.7% | |
| g | 44204 | 7.6% | |
| d | 44204 | 7.6% | |
| t | 44204 | 7.6% | |
| s | 13118 | 2.3% | |
| f | 13118 | 2.3% | |
| c | 13118 | 2.3% | |
| k | 266 | < 0.1% |
waterpoint_type
Categorical
| Distinct count | 7 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count | Frequency (%) | |
| communal standpipe | 28375 | 49.3% | |
| hand pump | 16181 | 28.1% | |
| other | 6167 | 10.7% | |
| communal standpipe multiple | 5959 | 10.3% | |
| improved spring | 783 | 1.4% | |
| cattle trough | 116 | 0.2% | |
| dam | 7 | < 0.1% |
Length
| Max length | 27 |
|---|---|
| Median length | 18 |
| Mean length | 14.95764743 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| p | 108555 | 12.6% | |
| m | 91598 | 10.6% | |
| n | 85632 | 9.9% | |
| a | 84972 | 9.9% | |
| (space) | 57373 | 6.7% | |
| u | 56590 | 6.6% | |
| d | 51305 | 6.0% | |
| e | 47359 | 5.5% | |
| t | 46808 | 5.4% | |
| l | 46368 | 5.4% | |
| i | 41859 | 4.9% | |
| o | 41400 | 4.8% | |
| s | 35117 | 4.1% | |
| c | 34450 | 4.0% | |
| h | 22464 | 2.6% | |
| r | 7849 | 0.9% | |
| g | 899 | 0.1% | |
| v | 783 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 804008 | 93.3% | |
| Space Separator | 57373 | 6.7% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| p | 108555 | 13.5% | |
| m | 91598 | 11.4% | |
| n | 85632 | 10.7% | |
| a | 84972 | 10.6% | |
| u | 56590 | 7.0% | |
| d | 51305 | 6.4% | |
| e | 47359 | 5.9% | |
| t | 46808 | 5.8% | |
| l | 46368 | 5.8% | |
| i | 41859 | 5.2% | |
| o | 41400 | 5.1% | |
| s | 35117 | 4.4% | |
| c | 34450 | 4.3% | |
| h | 22464 | 2.8% | |
| r | 7849 | 1.0% | |
| g | 899 | 0.1% | |
| v | 783 | 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 57373 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 804008 | 93.3% | |
| Common | 57373 | 6.7% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| p | 108555 | 13.5% | |
| m | 91598 | 11.4% | |
| n | 85632 | 10.7% | |
| a | 84972 | 10.6% | |
| u | 56590 | 7.0% | |
| d | 51305 | 6.4% | |
| e | 47359 | 5.9% | |
| t | 46808 | 5.8% | |
| l | 46368 | 5.8% | |
| i | 41859 | 5.2% | |
| o | 41400 | 5.1% | |
| s | 35117 | 4.4% | |
| c | 34450 | 4.3% | |
| h | 22464 | 2.8% | |
| r | 7849 | 1.0% | |
| g | 899 | 0.1% | |
| v | 783 | 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 57373 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 861381 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| p | 108555 | 12.6% | |
| m | 91598 | 10.6% | |
| n | 85632 | 9.9% | |
| a | 84972 | 9.9% | |
| (space) | 57373 | 6.7% | |
| u | 56590 | 6.6% | |
| d | 51305 | 6.0% | |
| e | 47359 | 5.5% | |
| t | 46808 | 5.4% | |
| l | 46368 | 5.4% | |
| i | 41859 | 4.9% | |
| o | 41400 | 4.8% | |
| s | 35117 | 4.1% | |
| c | 34450 | 4.0% | |
| h | 22464 | 2.6% | |
| r | 7849 | 0.9% | |
| g | 899 | 0.1% | |
| v | 783 | 0.1% |
waterpoint_type_group
Categorical
| Distinct count | 6 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count | Frequency (%) | |
| communal standpipe | 34334 | 59.6% | |
| hand pump | 16181 | 28.1% | |
| other | 6167 | 10.7% | |
| improved spring | 783 | 1.4% | |
| cattle trough | 116 | 0.2% | |
| dam | 7 | < 0.1% |
Length
| Max length | 18 |
|---|---|
| Median length | 18 |
| Mean length | 14.02635966 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| p | 102596 | 12.7% | |
| m | 85639 | 10.6% | |
| n | 85632 | 10.6% | |
| a | 84972 | 10.5% | |
| (space) | 51414 | 6.4% | |
| d | 51305 | 6.4% | |
| u | 50631 | 6.3% | |
| o | 41400 | 5.1% | |
| e | 41400 | 5.1% | |
| t | 40849 | 5.1% | |
| i | 35900 | 4.4% | |
| s | 35117 | 4.3% | |
| c | 34450 | 4.3% | |
| l | 34450 | 4.3% | |
| h | 22464 | 2.8% | |
| r | 7849 | 1.0% | |
| g | 899 | 0.1% | |
| v | 783 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 756336 | 93.6% | |
| Space Separator | 51414 | 6.4% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| p | 102596 | 13.6% | |
| m | 85639 | 11.3% | |
| n | 85632 | 11.3% | |
| a | 84972 | 11.2% | |
| d | 51305 | 6.8% | |
| u | 50631 | 6.7% | |
| o | 41400 | 5.5% | |
| e | 41400 | 5.5% | |
| t | 40849 | 5.4% | |
| i | 35900 | 4.7% | |
| s | 35117 | 4.6% | |
| c | 34450 | 4.6% | |
| l | 34450 | 4.6% | |
| h | 22464 | 3.0% | |
| r | 7849 | 1.0% | |
| g | 899 | 0.1% | |
| v | 783 | 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 51414 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 756336 | 93.6% | |
| Common | 51414 | 6.4% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| p | 102596 | 13.6% | |
| m | 85639 | 11.3% | |
| n | 85632 | 11.3% | |
| a | 84972 | 11.2% | |
| d | 51305 | 6.8% | |
| u | 50631 | 6.7% | |
| o | 41400 | 5.5% | |
| e | 41400 | 5.5% | |
| t | 40849 | 5.4% | |
| i | 35900 | 4.7% | |
| s | 35117 | 4.6% | |
| c | 34450 | 4.6% | |
| l | 34450 | 4.6% | |
| h | 22464 | 3.0% | |
| r | 7849 | 1.0% | |
| g | 899 | 0.1% | |
| v | 783 | 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 51414 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 807750 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| p | 102596 | 12.7% | |
| m | 85639 | 10.6% | |
| n | 85632 | 10.6% | |
| a | 84972 | 10.5% | |
| (space) | 51414 | 6.4% | |
| d | 51305 | 6.4% | |
| u | 50631 | 6.3% | |
| o | 41400 | 5.1% | |
| e | 41400 | 5.1% | |
| t | 40849 | 5.1% | |
| i | 35900 | 4.4% | |
| s | 35117 | 4.3% | |
| c | 34450 | 4.3% | |
| l | 34450 | 4.3% | |
| h | 22464 | 2.8% | |
| r | 7849 | 1.0% | |
| g | 899 | 0.1% | |
| v | 783 | 0.1% |
status_group
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count | Frequency (%) | |
| functional | 31389 | 54.5% | |
| non functional | 22268 | 38.7% | |
| functional needs repair | 3931 | 6.8% |
Length
| Max length | 23 |
|---|---|
| Median length | 10 |
| Mean length | 12.43410085 |
| Min length | 10 |
Most occurring characters
| Value | Count | Frequency (%) | |
| n | 163643 | 22.9% | |
| o | 79856 | 11.2% | |
| i | 61519 | 8.6% | |
| a | 61519 | 8.6% | |
| f | 57588 | 8.0% | |
| u | 57588 | 8.0% | |
| c | 57588 | 8.0% | |
| t | 57588 | 8.0% | |
| l | 57588 | 8.0% | |
| (space) | 30130 | 4.2% | |
| e | 11793 | 1.6% | |
| r | 7862 | 1.1% | |
| d | 3931 | 0.5% | |
| s | 3931 | 0.5% | |
| p | 3931 | 0.5% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 685925 | 95.8% | |
| Space Separator | 30130 | 4.2% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| n | 163643 | 23.9% | |
| o | 79856 | 11.6% | |
| i | 61519 | 9.0% | |
| a | 61519 | 9.0% | |
| f | 57588 | 8.4% | |
| u | 57588 | 8.4% | |
| c | 57588 | 8.4% | |
| t | 57588 | 8.4% | |
| l | 57588 | 8.4% | |
| e | 11793 | 1.7% | |
| r | 7862 | 1.1% | |
| d | 3931 | 0.6% | |
| s | 3931 | 0.6% | |
| p | 3931 | 0.6% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 30130 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 685925 | 95.8% | |
| Common | 30130 | 4.2% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| n | 163643 | 23.9% | |
| o | 79856 | 11.6% | |
| i | 61519 | 9.0% | |
| a | 61519 | 9.0% | |
| f | 57588 | 8.4% | |
| u | 57588 | 8.4% | |
| c | 57588 | 8.4% | |
| t | 57588 | 8.4% | |
| l | 57588 | 8.4% | |
| e | 11793 | 1.7% | |
| r | 7862 | 1.1% | |
| d | 3931 | 0.6% | |
| s | 3931 | 0.6% | |
| p | 3931 | 0.6% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 30130 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 716055 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| n | 163643 | 22.9% | |
| o | 79856 | 11.2% | |
| i | 61519 | 8.6% | |
| a | 61519 | 8.6% | |
| f | 57588 | 8.0% | |
| u | 57588 | 8.0% | |
| c | 57588 | 8.0% | |
| t | 57588 | 8.0% | |
| l | 57588 | 8.0% | |
| (space) | 30130 | 4.2% | |
| e | 11793 | 1.6% | |
| r | 7862 | 1.1% | |
| d | 3931 | 0.5% | |
| s | 3931 | 0.5% | |
| p | 3931 | 0.5% |
geometry
Categorical
| Distinct count | 57519 |
|---|---|
| Unique (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 450.0 KiB |
| Value | Count | Frequency (%) | |
| POINT (37.53051463 -6.96356538) | 2 | < 0.1% | |
| POINT (33.00627548 -2.51995041) | 2 | < 0.1% | |
| POINT (32.91986139 -2.47667983) | 2 | < 0.1% | |
| POINT (32.9780624 -2.51532072) | 2 | < 0.1% | |
| POINT (39.09837398 -6.98360619) | 2 | < 0.1% | |
| POINT (37.25011096 -7.10462503) | 2 | < 0.1% | |
| POINT (32.95652279 -2.4943533) | 2 | < 0.1% | |
| POINT (39.08628657 -6.99073094) | 2 | < 0.1% | |
| POINT (32.98856004 -2.48937845) | 2 | < 0.1% | |
| POINT (39.09906887 -6.98012199) | 2 | < 0.1% | |
| POINT (37.53277831 -6.96247516) | 2 | < 0.1% | |
| POINT (37.32890522 -7.17517443) | 2 | < 0.1% | |
| POINT (39.09138014 -6.97832237) | 2 | < 0.1% | |
| POINT (37.5433506 -6.96355665) | 2 | < 0.1% | |
| POINT (32.92601185 -2.46390984) | 2 | < 0.1% | |
| POINT (32.96573445 -2.5042939) | 2 | < 0.1% | |
| POINT (39.08596496 -6.99129411) | 2 | < 0.1% | |
| POINT (39.11921037 -6.99470401) | 2 | < 0.1% | |
| POINT (32.98478963 -2.49645868) | 2 | < 0.1% | |
| POINT (37.27435243 -7.10200368) | 2 | < 0.1% | |
| POINT (37.37401655 -7.05692253) | 2 | < 0.1% | |
| POINT (39.09206155 -6.98188419) | 2 | < 0.1% | |
| POINT (37.33981057 -7.06537264) | 2 | < 0.1% | |
| POINT (39.09851362 -6.980220399999999) | 2 | < 0.1% | |
| POINT (32.95559708 -2.50162744) | 2 | < 0.1% | |
| Other values (57494) | 57538 | 99.9% |
Length
| Max length | 44 |
|---|---|
| Median length | 31 |
| Mean length | 31.75244843 |
| Min length | 25 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 3 | 168609 | 9.2% | |
| 9 | 128743 | 7.0% | |
| (space) | 115176 | 6.3% | |
| . | 115176 | 6.3% | |
| 0 | 108922 | 6.0% | |
| 4 | 107898 | 5.9% | |
| 1 | 107706 | 5.9% | |
| 6 | 105432 | 5.8% | |
| 8 | 103883 | 5.7% | |
| 7 | 102891 | 5.6% | |
| 5 | 102049 | 5.6% | |
| 2 | 101371 | 5.5% | |
| P | 57588 | 3.1% | |
| O | 57588 | 3.1% | |
| I | 57588 | 3.1% | |
| N | 57588 | 3.1% | |
| T | 57588 | 3.1% | |
| ( | 57588 | 3.1% | |
| - | 57588 | 3.1% | |
| ) | 57588 | 3.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 1137504 | 62.2% | |
| Uppercase Letter | 287940 | 15.7% | |
| Space Separator | 115176 | 6.3% | |
| Other Punctuation | 115176 | 6.3% | |
| Open Punctuation | 57588 | 3.1% | |
| Dash Punctuation | 57588 | 3.1% | |
| Close Punctuation | 57588 | 3.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| P | 57588 | 20.0% | |
| O | 57588 | 20.0% | |
| I | 57588 | 20.0% | |
| N | 57588 | 20.0% | |
| T | 57588 | 20.0% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 115176 | 100.0% |
Most frequent Open Punctuation characters
| Value | Count | Frequency (%) | |
| ( | 57588 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 3 | 168609 | 14.8% | |
| 9 | 128743 | 11.3% | |
| 0 | 108922 | 9.6% | |
| 4 | 107898 | 9.5% | |
| 1 | 107706 | 9.5% | |
| 6 | 105432 | 9.3% | |
| 8 | 103883 | 9.1% | |
| 7 | 102891 | 9.0% | |
| 5 | 102049 | 9.0% | |
| 2 | 101371 | 8.9% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| . | 115176 | 100.0% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 57588 | 100.0% |
Most frequent Close Punctuation characters
| Value | Count | Frequency (%) | |
| ) | 57588 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 1540620 | 84.3% | |
| Latin | 287940 | 15.7% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| P | 57588 | 20.0% | |
| O | 57588 | 20.0% | |
| I | 57588 | 20.0% | |
| N | 57588 | 20.0% | |
| T | 57588 | 20.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 3 | 168609 | 10.9% | |
| 9 | 128743 | 8.4% | |
| (space) | 115176 | 7.5% | |
| . | 115176 | 7.5% | |
| 0 | 108922 | 7.1% | |
| 4 | 107898 | 7.0% | |
| 1 | 107706 | 7.0% | |
| 6 | 105432 | 6.8% | |
| 8 | 103883 | 6.7% | |
| 7 | 102891 | 6.7% | |
| 5 | 102049 | 6.6% | |
| 2 | 101371 | 6.6% | |
| ( | 57588 | 3.7% | |
| - | 57588 | 3.7% | |
| ) | 57588 | 3.7% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 1828560 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 3 | 168609 | 9.2% | |
| 9 | 128743 | 7.0% | |
| (space) | 115176 | 6.3% | |
| . | 115176 | 6.3% | |
| 0 | 108922 | 6.0% | |
| 4 | 107898 | 5.9% | |
| 1 | 107706 | 5.9% | |
| 6 | 105432 | 5.8% | |
| 8 | 103883 | 5.7% | |
| 7 | 102891 | 5.6% | |
| 5 | 102049 | 5.6% | |
| 2 | 101371 | 5.5% | |
| P | 57588 | 3.1% | |
| O | 57588 | 3.1% | |
| I | 57588 | 3.1% | |
| N | 57588 | 3.1% | |
| T | 57588 | 3.1% | |
| ( | 57588 | 3.1% | |
| - | 57588 | 3.1% | |
| ) | 57588 | 3.1% |
x
Real number (ℝ≥0)
| Distinct count | 57515 |
|---|---|
| Unique (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.149669123888835 |
|---|---|
| Minimum | 29.6071219 |
| Maximum | 40.34519307 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | 29.6071219 |
|---|---|
| 5-th percentile | 30.62360773 |
| Q1 | 33.28510016 |
| median | 35.00594322 |
| Q3 | 37.23371212 |
| 95-th percentile | 39.15049865 |
| Maximum | 40.34519307 |
| Range | 10.73807117 |
| Interquartile range (IQR) | 3.94861196 |
Descriptive statistics
| Standard deviation | 2.60742797 |
|---|---|
| Coefficient of variation (CV) | 0.07418072587 |
| Kurtosis | -0.8692761515 |
| Mean | 35.14966912 |
| Median Absolute Deviation (MAD) | 1.979294605 |
| Skewness | -0.1348112926 |
| Sum | 2024199.146 |
| Variance | 6.798680617 |
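Several of the figures above are derived from one another, which allows a quick consistency check. The sketch below recomputes the range, IQR, coefficient of variation, variance and sum from the other reported values. The variable names are illustrative, and the observation count of 57588 is taken from the dataset overview. The report does not state whether the standard deviation uses the population or the sample convention, so only the relationships between the reported numbers are checked.

```python
import math

# Figures copied from the quantile and descriptive statistics tables above.
minimum, maximum = 29.6071219, 40.34519307
q1, q3 = 33.28510016, 37.23371212
mean, std = 35.149669123888835, 2.60742797
n_observations = 57588  # from the dataset overview

# Quantities that follow directly from the figures above.
derived = {
    "Range": maximum - minimum,
    "Interquartile range (IQR)": q3 - q1,
    "Coefficient of variation (CV)": std / mean,
    "Variance": std ** 2,
    "Sum": mean * n_observations,
}

# The same quantities as printed in the report.
reported = {
    "Range": 10.73807117,
    "Interquartile range (IQR)": 3.94861196,
    "Coefficient of variation (CV)": 0.07418072587,
    "Variance": 6.798680617,
    "Sum": 2024199.146,
}

for name, value in derived.items():
    agrees = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.9g} reported={reported[name]:.9g} agrees={agrees}")
```

The relative tolerance allows for the rounding applied to the printed figures.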
| Value | Count | Frequency (%) | |
| 33.09034738 | 2 | < 0.1% | |
| 39.08628657 | 2 | < 0.1% | |
| 39.09309544 | 2 | < 0.1% | |
| 39.09851362 | 2 | < 0.1% | |
| 37.54340145 | 2 | < 0.1% | |
| 32.98856004 | 2 | < 0.1% | |
| 32.95652279 | 2 | < 0.1% | |
| 32.98767048 | 2 | < 0.1% | |
| 32.96700926 | 2 | < 0.1% | |
| 32.99327684 | 2 | < 0.1% | |
| 39.08596496 | 2 | < 0.1% | |
| 37.53432734 | 2 | < 0.1% | |
| 31.61952953 | 2 | < 0.1% | |
| 39.09568416 | 2 | < 0.1% | |
| 39.08618257 | 2 | < 0.1% | |
| 37.25219446 | 2 | < 0.1% | |
| 32.96573445 | 2 | < 0.1% | |
| 37.37571687 | 2 | < 0.1% | |
| 37.31891128 | 2 | < 0.1% | |
| 37.37401655 | 2 | < 0.1% | |
| 32.98269806 | 2 | < 0.1% | |
| 37.54090064 | 2 | < 0.1% | |
| 39.08887513 | 2 | < 0.1% | |
| 38.34050134 | 2 | < 0.1% | |
| 39.11921037 | 2 | < 0.1% | |
| Other values (57490) | 57538 | 99.9% |
| Value | Count | Frequency (%) | |
| 29.6071219 | 1 | < 0.1% | |
| 29.60720109 | 1 | < 0.1% | |
| 29.61032056 | 1 | < 0.1% | |
| 29.61096482 | 1 | < 0.1% | |
| 29.61194674 | 1 | < 0.1% | |
| 29.61250689 | 1 | < 0.1% | |
| 29.61276296 | 1 | < 0.1% | |
| 29.61344309 | 1 | < 0.1% | |
| 29.6168718 | 1 | < 0.1% | |
| 29.61847919 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 40.34519307 | 1 | < 0.1% | |
| 40.34430089 | 1 | < 0.1% | |
| 40.32523996 | 1 | < 0.1% | |
| 40.32522643 | 1 | < 0.1% | |
| 40.32340181 | 1 | < 0.1% | |
| 40.32283237 | 1 | < 0.1% | |
| 40.32280453 | 1 | < 0.1% | |
| 40.3226251 | 1 | < 0.1% | |
| 40.32216902 | 1 | < 0.1% | |
| 40.32196593 | 1 | < 0.1% |
y
Real number (ℝ)
| Distinct count | 57516 |
|---|---|
| Unique (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -5.885572340514864 |
|---|---|
| Minimum | -11.64944018 |
| Maximum | -0.99846435 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 450.0 KiB |
Quantile statistics
| Minimum | -11.64944018 |
|---|---|
| 5-th percentile | -10.60147827 |
| Q1 | -8.643840785 |
| median | -5.17270373 |
| Q3 | -3.372824195 |
| 95-th percentile | -1.802689797 |
| Maximum | -0.99846435 |
| Range | 10.65097583 |
| Interquartile range (IQR) | 5.27101659 |
Descriptive statistics
| Standard deviation | 2.809876457 |
|---|---|
| Coefficient of variation (CV) | -0.477417708 |
| Kurtosis | -1.203165882 |
| Mean | -5.885572341 |
| Median Absolute Deviation (MAD) | 2.041399535 |
| Skewness | -0.2522877584 |
| Sum | -338938.3399 |
| Variance | 7.895405705 |
| Value | Count | Frequency (%) | |
| -6.97627011 | 2 | < 0.1% | |
| -6.98584173 | 2 | < 0.1% | |
| -7.05692253 | 2 | < 0.1% | |
| -6.9787555 | 2 | < 0.1% | |
| -6.95974873 | 2 | < 0.1% | |
| -6.96355665 | 2 | < 0.1% | |
| -2.46390984 | 2 | < 0.1% | |
| -7.10374232 | 2 | < 0.1% | |
| -6.98318263 | 2 | < 0.1% | |
| -2.51995041 | 2 | < 0.1% | |
| -2.52871573 | 2 | < 0.1% | |
| -6.9802204 | 2 | < 0.1% | |
| -6.98945622 | 2 | < 0.1% | |
| -2.50658954 | 2 | < 0.1% | |
| -6.95674564 | 2 | < 0.1% | |
| -7.10462503 | 2 | < 0.1% | |
| -2.51661939 | 2 | < 0.1% | |
| -2.49454559 | 2 | < 0.1% | |
| -6.96247516 | 2 | < 0.1% | |
| -6.98311512 | 2 | < 0.1% | |
| -2.49645868 | 2 | < 0.1% | |
| -9.2893492 | 2 | < 0.1% | |
| -2.51532072 | 2 | < 0.1% | |
| -6.99054864 | 2 | < 0.1% | |
| -6.9642576 | 2 | < 0.1% | |
| Other values (57491) | 57538 | 99.9% |
| Value | Count | Frequency (%) | |
| -11.64944018 | 1 | < 0.1% | |
| -11.64837759 | 1 | < 0.1% | |
| -11.58629656 | 1 | < 0.1% | |
| -11.56857679 | 1 | < 0.1% | |
| -11.56680457 | 1 | < 0.1% | |
| -11.56450865 | 1 | < 0.1% | |
| -11.56432357 | 1 | < 0.1% | |
| -11.56231592 | 1 | < 0.1% | |
| -11.56228898 | 1 | < 0.1% | |
| -11.56161898 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| -0.99846435 | 1 | < 0.1% | |
| -0.998916 | 1 | < 0.1% | |
| -0.99901209 | 1 | < 0.1% | |
| -0.99911702 | 1 | < 0.1% | |
| -0.9994692 | 1 | < 0.1% | |
| -0.99950651 | 1 | < 0.1% | |
| -0.99952232 | 1 | < 0.1% | |
| -1.00058519 | 1 | < 0.1% | |
| -1.0015208 | 1 | < 0.1% | |
| -1.00198784 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
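The calculation described above can be sketched in plain Python. This is a minimal illustration rather than the report's implementation: the function name `pearson_r` is hypothetical, and the sample values are the longitude and latitude of the first five rows listed under "First rows".

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the variances cancel, so plain sums suffice.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

longitude = [34.938093, 34.698766, 37.460664, 38.486161, 31.130847]
latitude = [-9.856322, -2.147466, -3.821329, -11.155298, -1.825359]

r = pearson_r(longitude, latitude)

# Shifting and rescaling one variable (with a positive factor) leaves r unchanged.
rescaled = [2.0 * value + 10.0 for value in longitude]
r_rescaled = pearson_r(rescaled, latitude)

print(f"r = {r:.6f}, after rescaling = {r_rescaled:.6f}")
```

Five rows are far too few to say anything about the full dataset; the sample only serves to show the location and scale invariance.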
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
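As a rough sketch of this definition, the code below ranks both variables (assigning tied values their average rank, which is an assumption about tie handling) and then applies the Pearson formula to the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and the data is a made-up monotonic but nonlinear relationship.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Illustrative data: y grows monotonically but not linearly with x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x ** 3 for x in xs]

rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(f"Spearman rho = {rho:.4f}, Pearson r = {r:.4f}")
```

Because the ranks of `xs` and `ys` coincide, ρ reaches its maximum while r stays below it, which is the difference the paragraph above describes.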
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
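The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows, with the hypothetical function name `kendall_tau_a` and made-up data; statistical libraries commonly use tie-adjusted variants such as tau-b, so results on data with ties may differ from this sketch.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # A pair is concordant when both variables order it the same way.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in either variable count as neither.
    return (concordant - discordant) / len(pairs)

# Illustrative data: one of the three pairs is ordered differently.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(f"tau = {tau:.4f}")
```

Enumerating every pair takes quadratic time, which is fine for an illustration but slow for a dataset with tens of thousands of observations.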
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proven to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013, which can be found here.
First rows
| | Unnamed: 0 | id | amount_tsh | date_recorded | funder | gps_height | installer | longitude | latitude | wpt_name | num_private | basin | subvillage | region | region_code | district_code | lga | ward | population | public_meeting | recorded_by | scheme_management | scheme_name | permit | construction_year | extraction_type | extraction_type_group | extraction_type_class | management | management_group | payment | payment_type | water_quality | quality_group | quantity | quantity_group | source | source_type | source_class | waterpoint_type | waterpoint_type_group | status_group | geometry | x | y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 69572 | 6000.0 | 2011-03-14 | Roman | 1390 | Roman | 34.938093 | -9.856322 | none | 0 | Lake Nyasa | Mnyusi B | Iringa | 11 | 5 | Ludewa | Mundindi | 109 | True | GeoData Consultants Ltd | VWC | Roman | False | 1999 | gravity | gravity | gravity | vwc | user-group | pay annually | annually | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe | functional | POINT (34.93809275 -9.856321769999999) | 34.938093 | -9.856322 |
| 1 | 1 | 8776 | 0.0 | 2013-03-06 | Grumeti | 1399 | GRUMETI | 34.698766 | -2.147466 | Zahanati | 0 | Lake Victoria | Nyamara | Mara | 20 | 2 | Serengeti | Natta | 280 | NaN | GeoData Consultants Ltd | Other | NaN | True | 2010 | gravity | gravity | gravity | wug | user-group | never pay | never pay | soft | good | insufficient | insufficient | rainwater harvesting | rainwater harvesting | surface | communal standpipe | communal standpipe | functional | POINT (34.6987661 -2.14746569) | 34.698766 | -2.147466 |
| 2 | 2 | 34310 | 25.0 | 2013-02-25 | Lottery Club | 686 | World vision | 37.460664 | -3.821329 | Kwa Mahundi | 0 | Pangani | Majengo | Manyara | 21 | 4 | Simanjiro | Ngorika | 250 | True | GeoData Consultants Ltd | VWC | Nyumba ya mungu pipe scheme | True | 2009 | gravity | gravity | gravity | vwc | user-group | pay per bucket | per bucket | soft | good | enough | enough | dam | dam | surface | communal standpipe multiple | communal standpipe | functional | POINT (37.46066446 -3.82132853) | 37.460664 | -3.821329 |
| 3 | 3 | 67743 | 0.0 | 2013-01-28 | Unicef | 263 | UNICEF | 38.486161 | -11.155298 | Zahanati Ya Nanyumbu | 0 | Ruvuma / Southern Coast | Mahakamani | Mtwara | 90 | 63 | Nanyumbu | Nanyumbu | 58 | True | GeoData Consultants Ltd | VWC | NaN | True | 1986 | submersible | submersible | submersible | vwc | user-group | never pay | never pay | soft | good | dry | dry | machine dbh | borehole | groundwater | communal standpipe multiple | communal standpipe | non functional | POINT (38.48616088 -11.15529772) | 38.486161 | -11.155298 |
| 4 | 4 | 19728 | 0.0 | 2011-07-13 | Action In A | 0 | Artisan | 31.130847 | -1.825359 | Shuleni | 0 | Lake Victoria | Kyanyamisa | Kagera | 18 | 1 | Karagwe | Nyakasimbi | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | gravity | gravity | gravity | other | other | never pay | never pay | soft | good | seasonal | seasonal | rainwater harvesting | rainwater harvesting | surface | communal standpipe | communal standpipe | functional | POINT (31.13084671 -1.82535885) | 31.130847 | -1.825359 |
| 5 | 5 | 9944 | 20.0 | 2011-03-13 | Mkinga Distric Coun | 0 | DWE | 39.172796 | -4.765587 | Tajiri | 0 | Pangani | Moa/Mwereme | Tanga | 4 | 8 | Mkinga | Moa | 1 | True | GeoData Consultants Ltd | VWC | Zingibali | True | 2009 | submersible | submersible | submersible | vwc | user-group | pay per bucket | per bucket | salty | salty | enough | enough | other | other | unknown | communal standpipe multiple | communal standpipe | functional | POINT (39.1727956 -4.76558728) | 39.172796 | -4.765587 |
| 6 | 6 | 19816 | 0.0 | 2012-10-01 | Dwsp | 0 | DWSP | 33.362410 | -3.766365 | Kwa Ngomho | 0 | Internal | Ishinabulandi | Shinyanga | 17 | 3 | Shinyanga Rural | Samuye | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | swn 80 | swn 80 | handpump | vwc | user-group | never pay | never pay | soft | good | enough | enough | machine dbh | borehole | groundwater | hand pump | hand pump | non functional | POINT (33.36240982 -3.76636472) | 33.362410 | -3.766365 |
| 7 | 7 | 54551 | 0.0 | 2012-10-09 | Rwssp | 0 | DWE | 32.620617 | -4.226198 | Tushirikiane | 0 | Lake Tanganyika | Nyawishi Center | Shinyanga | 17 | 3 | Kahama | Chambo | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | nira/tanira | nira/tanira | handpump | wug | user-group | unknown | unknown | milky | milky | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump | non functional | POINT (32.62061707 -4.22619802) | 32.620617 | -4.226198 |
| 8 | 8 | 53934 | 0.0 | 2012-11-03 | Wateraid | 0 | Water Aid | 32.711100 | -5.146712 | Kwa Ramadhan Musa | 0 | Lake Tanganyika | Imalauduki | Tabora | 14 | 6 | Tabora Urban | Itetemia | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | india mark ii | india mark ii | handpump | vwc | user-group | never pay | never pay | salty | salty | seasonal | seasonal | machine dbh | borehole | groundwater | hand pump | hand pump | non functional | POINT (32.71110001 -5.14671181) | 32.711100 | -5.146712 |
| 9 | 9 | 46144 | 0.0 | 2011-08-03 | Isingiro Ho | 0 | Artisan | 30.626991 | -1.257051 | Kwapeto | 0 | Lake Victoria | Mkonomre | Kagera | 18 | 1 | Karagwe | Kaisho | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | nira/tanira | nira/tanira | handpump | vwc | user-group | never pay | never pay | soft | good | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump | functional | POINT (30.62699053 -1.25705061) | 30.626991 | -1.257051 |
Last rows
| | Unnamed: 0 | id | amount_tsh | date_recorded | funder | gps_height | installer | longitude | latitude | wpt_name | num_private | basin | subvillage | region | region_code | district_code | lga | ward | population | public_meeting | recorded_by | scheme_management | scheme_name | permit | construction_year | extraction_type | extraction_type_group | extraction_type_class | management | management_group | payment | payment_type | water_quality | quality_group | quantity | quantity_group | source | source_type | source_class | waterpoint_type | waterpoint_type_group | status_group | geometry | x | y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 57578 | 59390 | 13677 | 0.0 | 2011-08-04 | Rudep | 1715 | DWE | 31.370848 | -8.258160 | Kwa Mzee Atanas | 0 | Lake Tanganyika | Kitonto | Rukwa | 15 | 2 | Sumbawanga Rural | Mkowe | 150 | True | GeoData Consultants Ltd | VWC | NaN | False | 1991 | swn 80 | swn 80 | handpump | vwc | user-group | never pay | never pay | soft | good | insufficient | insufficient | machine dbh | borehole | groundwater | hand pump | hand pump | functional | POINT (31.37084807 -8.25816008) | 31.370848 | -8.258160 |
| 57579 | 59391 | 44885 | 0.0 | 2013-08-03 | Government Of Tanzania | 540 | Government | 38.044070 | -4.272218 | Kwa | 0 | Pangani | Maore Kati | Kilimanjaro | 3 | 3 | Same | Maore | 210 | True | GeoData Consultants Ltd | Water authority | Hingilili | True | 1967 | gravity | gravity | gravity | vwc | user-group | never pay | never pay | soft | good | enough | enough | river | river/lake | surface | communal standpipe | communal standpipe | non functional | POINT (38.04406992 -4.27221758) | 38.044070 | -4.272218 |
| 57580 | 59392 | 40607 | 0.0 | 2011-04-15 | Government Of Tanzania | 0 | Government | 33.009440 | -8.520888 | Benard Charles | 0 | Lake Rukwa | Mbuyuni A | Mbeya | 12 | 1 | Chunya | Mbuyuni | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | gravity | gravity | gravity | vwc | user-group | never pay | never pay | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe | non functional | POINT (33.00944043 -8.52088818) | 33.009440 | -8.520888 |
| 57581 | 59393 | 48348 | 0.0 | 2012-10-27 | Private | 0 | Private | 33.866852 | -4.287410 | Kwa Peter | 0 | Internal | Masanga | Tabora | 14 | 2 | Igunga | Igunga | 0 | False | GeoData Consultants Ltd | Water authority | NaN | False | 0 | gravity | gravity | gravity | private operator | commercial | pay per bucket | per bucket | soft | good | insufficient | insufficient | dam | dam | surface | other | other | functional | POINT (33.86685217 -4.28740983) | 33.866852 | -4.287410 |
| 57582 | 59394 | 11164 | 500.0 | 2011-03-09 | World Bank | 351 | ML appro | 37.634053 | -6.124830 | Chimeredya | 0 | Wami / Ruvu | Komstari | Morogoro | 5 | 6 | Mvomero | Diongoya | 89 | True | GeoData Consultants Ltd | VWC | NaN | True | 2007 | submersible | submersible | submersible | vwc | user-group | pay monthly | monthly | soft | good | enough | enough | machine dbh | borehole | groundwater | communal standpipe | communal standpipe | non functional | POINT (37.63405278 -6.12482968) | 37.634053 | -6.124830 |
| 57583 | 59395 | 60739 | 10.0 | 2013-05-03 | Germany Republi | 1210 | CES | 37.169807 | -3.253847 | Area Three Namba 27 | 0 | Pangani | Kiduruni | Kilimanjaro | 3 | 5 | Hai | Masama Magharibi | 125 | True | GeoData Consultants Ltd | Water Board | Losaa Kia water supply | True | 1999 | gravity | gravity | gravity | water board | user-group | pay per bucket | per bucket | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe | functional | POINT (37.16980689 -3.25384746) | 37.169807 | -3.253847 |
| 57584 | 59396 | 27263 | 4700.0 | 2011-05-07 | Cefa-njombe | 1212 | Cefa | 35.249991 | -9.070629 | Kwa Yahona Kuvala | 0 | Rufiji | Igumbilo | Iringa | 11 | 4 | Njombe | Ikondo | 56 | True | GeoData Consultants Ltd | VWC | Ikondo electrical water sch | True | 1996 | gravity | gravity | gravity | vwc | user-group | pay annually | annually | soft | good | enough | enough | river | river/lake | surface | communal standpipe | communal standpipe | functional | POINT (35.24999126 -9.0706288) | 35.249991 | -9.070629 |
| 57585 | 59397 | 37057 | 0.0 | 2011-04-11 | NaN | 0 | NaN | 34.017087 | -8.750434 | Mashine | 0 | Rufiji | Madungulu | Mbeya | 12 | 7 | Mbarali | Chimala | 0 | True | GeoData Consultants Ltd | VWC | NaN | False | 0 | swn 80 | swn 80 | handpump | vwc | user-group | pay monthly | monthly | fluoride | fluoride | enough | enough | machine dbh | borehole | groundwater | hand pump | hand pump | functional | POINT (34.01708706 -8.750434329999999) | 34.017087 | -8.750434 |
| 57586 | 59398 | 31282 | 0.0 | 2011-03-08 | Malec | 0 | Musa | 35.861315 | -6.378573 | Mshoro | 0 | Rufiji | Mwinyi | Dodoma | 1 | 4 | Chamwino | Mvumi Makulu | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | nira/tanira | nira/tanira | handpump | vwc | user-group | never pay | never pay | soft | good | insufficient | insufficient | shallow well | shallow well | groundwater | hand pump | hand pump | functional | POINT (35.86131531 -6.37857327) | 35.861315 | -6.378573 |
| 57587 | 59399 | 26348 | 0.0 | 2011-03-23 | World Bank | 191 | World | 38.104048 | -6.747464 | Kwa Mzee Lugawa | 0 | Wami / Ruvu | Kikatanyemba | Morogoro | 5 | 2 | Morogoro Rural | Ngerengere | 150 | True | GeoData Consultants Ltd | VWC | NaN | True | 2002 | nira/tanira | nira/tanira | handpump | vwc | user-group | pay when scheme fails | on failure | salty | salty | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump | functional | POINT (38.10404822 -6.74746425) | 38.104048 | -6.747464 |